ABSTRACT

This report discusses the role of coherence considerations
in the definition and measurement of subjective probability.
A general version of de Finetti's coherence theorem (that
either a set of betting probabilities obeys the laws of
probability or else a sure win is possible for the bettor)
is proved, using a variant of Farkas' Lemma. This theorem
provides the basis for several admissibility theorems for
scoring-rule probabilities, under a generalization of
scoring rules suggested by Lindley. Linear programming
methods for identifying and reconciling incoherence are
discussed, and a comparison is made with Bayesian
reconciliation methods.


Coherent  Assessment  of  Subjective  Probability 


by  Robert  F.  Nau 


1.  Introduction 

The purpose of this report is to aggregate and generalize some well-known results of de
Finetti (1937, 1972, 1974), Smith (1961), and Savage (1971) and some recent results of
Lindley (1980) concerning the use of betting systems and scoring rules for eliciting subjective
probabilities, and to discuss methods for identifying and reconciling incoherence. The principal
analytic tool will be a separating-hyperplane theorem of linear algebra, which, together with its
variants and extensions, has previously been applied by numerous authors to discussions of
coherence and admissibility in statistical inference and decision (e.g., Blackwell and Girshick
(1954), Smith (1961), Cornfield (1969), Freedman and Purves (1969), Dawid and Stone
(1972, 1973), Heath and Sudderth (1972, 1978), Pierce (1973), and Buehler (1976)). The
central problem discussed here is the elicitation of subjective conditional probabilities for a set
of events which are subsets of a finite sample space, with conditional probabilities directly
defined in terms of "called-off bets," rather than as ratios of unconditional probabilities.
Coherence (of betting probabilities) and admissibility (of scoring-rule probabilities) are defined in
terms of avoiding unnecessary certain loss under all outcomes in the sample space, and are
shown to be equivalent criteria for defining and measuring subjective probability. A general
version of de Finetti's coherence theorem, stated in terms of lower and upper conditional
betting probabilities, is proved using the separating-hyperplane theorem below. The coherence
theorem provides a basis for several admissibility theorems for scoring-rule probabilities, under
a generalization of scoring rules suggested by Lindley (1980). Throughout, emphasis is placed
on the distinction between strict and non-strict forms of coherence and admissibility,
illuminating the role of zero probabilities in subjectivistic theory. The construction of a
probability measure consistent with a given set of betting probabilities, whose existence is required
for coherence, is shown to be a simple linear programming problem, whose dual is the search
for a combination of bets providing a "sure win." The geometric interpretation of coherence
suggests linear programming methods for improving the precision of probability assessments
and reconciling incoherence, using lower and upper probabilities to characterize imprecise initial
assessments. These methods are shown to provide a computationally simpler alternative to the
Bayesian reconciliation methods of Lindley, Tversky, and Brown (1979).

The various so-called separating-hyperplane theorems can all be derived from a "basic
separation theorem" for linear spaces (Dunford and Schwartz (1958), p. 412), which states that
any two disjoint convex sets (say, $X$ and $Y$), one of which has an interior point, can be
separated by a non-trivial linear functional, i.e., there exists a linear functional $f$, not identically
zero, and a real number $d$ such that $\mathrm{Re}[f(x)] \le d$ for all $x$ in $X$ and
$\mathrm{Re}[f(y)] \ge d$ for all $y$ in $Y$. If the two sets are also closed, then the separation
can be made strict (i.e., strict inequality can be obtained in at least one of the above relations).
In finite-dimensional Euclidean space the linear functional takes the form $f(x) = z'x$ where $z$
is a fixed vector, with the geometric interpretation that $X$ and $Y$ are separated by a hyperplane
whose normal direction is $z$ and whose distance from the origin is $d$. The following conventions
and notation for vector inequalities will be useful: a vector $x$ is nonnegative ("$x \geqq 0$") if all
of its components are nonnegative; $x$ is semi-positive ("$x \ge 0$") if it is nonnegative and not
the zero vector; and $x$ is positive ("$x > 0$") if all of its components are positive, where $0$
denotes the zero vector of appropriate length. Corresponding definitions and notation apply to
non-positive, semi-negative, and negative vectors. In these terms, the theorem for later use is:

THEOREM 1. Exactly one of the following two systems has a solution:

(i) $Az < 0$ [$Az \le 0$], $z \geqq 0$

(ii) $w'A \geqq 0$, $w \ge 0$ [$w > 0$], $\sum_j w_j = 1$

where  A  is  a  matrix  and  z  and  w  are  vectors  of  appropriate  length. 

Proof: For the unbracketed case, obviously both systems cannot simultaneously have
solutions. Then either the non-negative orthant contains a point of the closed convex
hull of the row vectors of $A$, in which case (ii) has a solution, or else there exists a
hyperplane which strictly separates these two closed convex sets. The normal direction of
this hyperplane constitutes a solution to (i). For the bracketed case, again both systems
cannot simultaneously have solutions. Then either the non-negative orthant contains a
point of the open convex cone of the row vectors of $A$, in which case a solution to (ii) is
obtained by normalization, or else for some $j$ the following system must have no solution:
$w'A + a_j \geqq 0$, $w \geqq 0$, where $a_j$ denotes the $j$th row vector of $A$. (If this system had a
solution for every $j$, then their sum plus the vector whose components are all 1's would
constitute a solution to (ii) following normalization.) For some $j$, then, the non-negative
orthant has no point in common with the closed convex set formed by the direct sum of
$a_j$ and the closed convex cone of all the row vectors of $A$, so that these two sets are
strictly separated by some hyperplane. The normal direction of this hyperplane then
constitutes a solution to (i), in which the $j$th element of $Az$ is negative.

In applications, a vector $w$ satisfying (ii) will be considered to represent a probability
distribution. The following corollary is closely related to Farkas' Lemma ("either $Az \leqq 0$,
$c'z > 0$ has a solution, or else $w'A = c'$, $w \geqq 0$ has a solution"), which is the basis of the
duality theorem of linear programming.

COROLLARY:  Exactly  one  of  the  following  two  systems  has  a  solution: 

(i) $Az < 0$ [$Az \le 0$]

(ii) $w'A = 0$, $w \ge 0$ [$w > 0$], $\sum_j w_j = 1$

This follows by applying Theorem 1 to the matrix $[A \mid -A]$.


2.  Coherence  for  the  fair  bookie 

This section deals with the elicitation of subjective probabilities under a betting system,
that is, a framework in which a transaction takes place between a bookie and a bettor. De
Finetti's (1937) well-known theorem on coherence (that either a bookie's betting probabilities
("bet prices") obey the laws of probability or else a sure win is possible for the bettor) is proved
for the general case of conditional bets on a finite number of events, and the geometrical
interpretation of coherence is discussed. A further generalization of the coherence theorem to
incorporate lower and upper probabilities is given in a later section.

Consider a bookie, a bettor, and $n$ pairs of events: $(E_i, F_i)$, $i = 1, \ldots, n$. The bookie must
establish prices ("set the odds") for bets on $E_i$ conditional on $F_i$ ("$E_i$ given $F_i$") for all $i$, and
the bettor may then place any combination of bets. Following de Finetti's convention, capital
letters such as $E$ and $F$ will be used interchangeably as names for events and also as the
indicator variables for the same events, e.g., "$E = 1$" is interchangeable with "$E$ is true", and
"$1 - E$" is interchangeable with $\bar{E}$ ("not-$E$"). Let the transaction be described as follows: first
the bookie chooses a vector $p = (p_1, \ldots, p_n)$, where $p_i$ is his price for buying or selling a
"unit bet" on $E_i$ given $F_i$, i.e., a lottery which pays 1 unit if $E_i F_i = 1$, pays zero if
$(1 - E_i) F_i = 1$, and pays back the purchase price (in which case the bet is considered "called
off") if $F_i = 0$. (The bookie is "fair" in the sense that he buys and sells at the same price. The
more general case of unequal buying and selling prices is discussed in a later section.) The
bettor then chooses a vector $z = (z_1, \ldots, z_n)$, where $|z_i|$ is the number of unit bets on $E_i$
given $F_i$ that he wishes to buy (if $z_i > 0$) or sell (if $z_i < 0$). The net gain to the bookie for the
bet on the $i$th event pair in all cases is given by the expression $(p_i - E_i) F_i z_i$, which may be
positive, negative, or zero. In conventional betting parlance, the bookie is said to have offered
"odds of $(1 - p_i)$ to $p_i$ against $E_i$" and reciprocal odds "on" $E_i$; the bettor has placed a stake of
$|p_i z_i|$ "on" $E_i$ if $z_i > 0$ or "against" $E_i$ if $z_i < 0$, all conditional on $F_i$.


It is assumed that the $E_i$'s and $F_i$'s are subsets of a sample space, $\Theta$, consisting of $m$
mutually exclusive and collectively exhaustive outcomes which are known to both the bookie
and the bettor. That is, both participants are aware of all logical dependencies among the $2n$
events of interest, which place restrictions on their possible joint realizations, since every
possible joint realization must correspond to at least one outcome in the sample space. Let $\theta_j$
denote the $j$th element of $\Theta$, and also the event consisting of only that outcome. Let $E_{ij}$ and
$F_{ij}$ denote the values of $E_i$ and $F_i$ under outcome $j$, i.e., $E_{ij} = 1$ if $E_i$ contains $\theta_j$, and
$E_{ij} = 0$ otherwise. Then the total net gain to the bookie for all $n$ bets when $\theta_j$ obtains is

$$t_j(z;p) = \sum_{i=1}^{n} (p_i - E_{ij}) F_{ij} z_i . \qquad (2.1)$$

The "payoff vector" for the bookie, $t(z;p)$, is now defined as the $m$-vector whose $j$th element is
$t_j(z;p)$.
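
As an illustration of (2.1), the following minimal Python sketch computes the bookie's payoff vector from indicator arrays. The function name `payoff_vector`, the nested-list layout of `E` and `F`, and the numerical example are assumptions made here for illustration only; they do not appear in the report.

```python
from fractions import Fraction


def payoff_vector(p, E, F, z):
    """Bookie's payoff vector t(z;p), following equation (2.1).

    p[i]    : bet price for E_i given F_i
    E[i][j] : 1 if event E_i contains outcome j, else 0
    F[i][j] : 1 if event F_i contains outcome j, else 0
    z[i]    : unit bets bought (z[i] > 0) or sold (z[i] < 0) by the bettor
    """
    n, m = len(p), len(E[0])
    return [
        sum((p[i] - E[i][j]) * F[i][j] * z[i] for i in range(n))
        for j in range(m)
    ]


# Hypothetical example with three outcomes and a single conditional bet on
# E = {theta_1} given F = {theta_1, theta_2}, priced at 1/4.
p = [Fraction(1, 4)]
E = [[1, 0, 0]]
F = [[1, 1, 0]]
t = payoff_vector(p, E, F, z=[2])  # the bettor buys two unit bets
print(t)
```

In this example the bookie loses 3/2 if $\theta_1$ obtains, gains 1/2 if $\theta_2$ obtains, and gains nothing if $\theta_3$ obtains, since the bet is then called off.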

DEFINITION: The vector of prices $p$ is [strictly] coherent for the bookie if-and-only-if
there does not exist any vector of bets $z$ for which the resulting payoff vector is
[semi-]negative.

In other words, the bookie's prices are coherent if there is no "sure-win" bet for the bettor (one
for which the bookie loses money under every outcome), and strictly coherent if there is no
"can't-lose" bet (one for which the bookie loses under at least one outcome, and wins under
none). Necessary conditions for coherence or strict coherence in certain cases can be
immediately identified. For example, if $F_i = 0$ is impossible, then coherence requires
$0 \le p_i \le 1$, since choosing $z_i > 0$ if $p_i < 0$, or $z_i < 0$ if $p_i > 1$, would produce a sure win
for the bettor. Similarly, if both $E_i F_i = 1$ and $(1 - E_i) F_i = 1$ are possible then strict coherence
requires $0 < p_i < 1$. On the other hand, if $E_i F_i = 1$ is possible but $(1 - E_i) F_i = 1$ is not (or
vice versa), then strict coherence requires $p_i = 1$ (or $p_i = 0$). In general, the necessary and
sufficient conditions for coherence or strict coherence are given by:

THEOREM 2. $p$ is [strictly] coherent if-and-only-if there exists a [positive] probability
distribution $w$ on $\Theta$, and a corresponding probability measure $P_w$ on all subsets of $\Theta$, such
that for every $i$ either $p_i = P_w(E_i|F_i)$ or else $P_w(F_i) = 0$.


Proof: Let $A$ be the $m \times n$ matrix whose $(j,i)$th element is $(p_i - E_{ij}) F_{ij}$. Then $t(z;p) = Az$.
By the Corollary to Theorem 1, $Az < 0$ [$Az \le 0$] has no solution if-and-only-if there
exists $w \ge 0$ [$w > 0$] such that $w'A = 0$. This vector equality is equivalent to:

$$\sum_{j=1}^{m} (p_i - E_{ij}) F_{ij} w_j = 0, \quad i = 1, \ldots, n \qquad (2.2)$$

whence either

$$\sum_{j=1}^{m} F_{ij} w_j = 0, \qquad (2.3)$$

or else

$$p_i = \frac{\sum_{j=1}^{m} E_{ij} F_{ij} w_j}{\sum_{j=1}^{m} F_{ij} w_j} . \qquad (2.4)$$

Let $P_w$ be the unique, finitely additive probability measure on all subsets of $\Theta$ which
satisfies $P_w(\theta_j) = w_j$ for all $j$. That is,

$$P_w(F_i) = \sum_{j=1}^{m} F_{ij} w_j \qquad (2.5)$$

and similarly for all other subsets of the sample space. Define the conditional probability
of $E_i$ given $F_i$ in the usual way as

$$P_w(E_i|F_i) = \frac{P_w(E_i F_i)}{P_w(F_i)} . \qquad (2.6)$$

Substitution of (2.5) and (2.6) into (2.3) and (2.4) completes the proof.
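
The condition in Theorem 2 can be checked directly for a candidate distribution. The sketch below is a hedged illustration only: the helper names `prob` and `agrees_with_prices` and the example numbers are hypothetical. It tests the product form $P_w(E_i F_i) = p_i P_w(F_i)$, which is equivalent to (2.2) and therefore covers both the case $P_w(F_i) = 0$ and the case $p_i = P_w(E_i|F_i)$.

```python
from fractions import Fraction


def prob(w, event):
    """P_w(event) as in (2.5): total weight of the outcomes in the event."""
    return sum(w_j for w_j, inside in zip(w, event) if inside)


def agrees_with_prices(w, p, E, F):
    """True if, for every i, p_i = P_w(E_i | F_i) or else P_w(F_i) = 0."""
    for p_i, E_i, F_i in zip(p, E, F):
        EF_i = [e * f for e, f in zip(E_i, F_i)]
        # Product form of (2.2); holds trivially when P_w(F_i) = 0.
        if prob(w, EF_i) != p_i * prob(w, F_i):
            return False
    return True


# Hypothetical example: E = {theta_1}, F = {theta_1, theta_3}.
w = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]
E = [[1, 0, 0]]
F = [[1, 0, 1]]
ok = agrees_with_prices(w, [Fraction(2, 5)], E, F)   # (1/5) / (1/2) = 2/5
bad = agrees_with_prices(w, [Fraction(1, 2)], E, F)
print(ok, bad)
```

Exact rational arithmetic is used so that the equality test is not disturbed by floating-point rounding.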

This theorem provides the motivation for de Finetti's definition of subjective probabilities as
coherent bet prices. From the definition of the probability measure $P_w$ in (2.5) and (2.6), it
follows that the quantities $P_w(E_i|F_i)$, $i = 1, \ldots, n$, obey the usual "laws" of probability,
including the additive and multiplicative laws. (This will be illustrated in the geometrical examples
below.) Theorem 2 implies that, by conformity with some such measure, coherent bet prices
obey the same laws merely through the fact of being coherent, rather than by prior assumption.
Thus, if coherence is taken as an axiom of subjective probability, the probability laws which are
traditionally stated as axioms or definitions are obtained instead as theorems. (De Finetti
(1937) gives separate proofs of the "total probability theorem," or additive law, and the
"compound probability theorem," or multiplicative law, based on evaluation of the determinant of $A$
in particular cases.) It is important to note that, in this approach, conditional probability, directly
defined through the device of called-off bets, is the fundamental notion. Unconditional
probability is obtained as a special case when the conditioning event happens to be the certain
event, i.e., the whole sample space. This is in contrast to the conventional approach to probability
theory, which begins with a definition of unconditional probabilities, and then derives
conditional probabilities according to (2.6).

The distribution $w$ satisfying $w'A = 0$, whose existence is required for $p$ to be coherent,
need not be unique. Let $W(p)$ denote the closed, convex set consisting of all such $w$. Given
any $w$ in $W(p)$, the probability measure $P_w$ is defined for all subsets of the sample space, not
merely the $2n$ events initially considered. This provides a basis for inferences about the
possible coherent values for bet prices on further pairs of events which are subsets of the same
sample space. Let $E_{n+1}$ and $F_{n+1}$ denote such a further pair of events, and let $p_{n+1}$ denote the bet
price for $E_{n+1}$ given $F_{n+1}$. Then, given that $p$ is coherent, a necessary and sufficient condition
for $(p, p_{n+1})$ to also be coherent is that either $p_{n+1} = P_w(E_{n+1}|F_{n+1})$ or else $P_w(F_{n+1}) = 0$ for some
$w$ in $W(p)$. In the latter case, $p_{n+1}$ may coherently assume any value whatever. In the former
case, $P_w(E_{n+1}|F_{n+1})$ is a continuous, bounded function defined everywhere in $W(p)$, and hence
achieves a minimum and maximum (denoted $p_{n+1}^{-}$ and $p_{n+1}^{+}$, respectively) on this set, as well
as all values in between. Thus, if $P_w(F_{n+1}) > 0$ for all $w$ in $W(p)$, then $(p, p_{n+1})$ is coherent
if-and-only-if $p_{n+1}^{-} \le p_{n+1} \le p_{n+1}^{+}$. This is de Finetti's "fundamental theorem of probability"
(1974, p. 112). In fact, $p_{n+1}^{-}$ and $p_{n+1}^{+}$ are lower and upper conditional probabilities for $E_{n+1}$
given $F_{n+1}$, indirectly determined by $p$, in the sense that they represent the lowest selling price
and the highest buying price for a unit bet on $E_{n+1}$ given $F_{n+1}$ which would be consistent with
$p$. This notion will be developed further in Section 4.

In order to be strictly coherent, a set of bet prices must not only obey the probability laws,
but also be consistent with some assignment of positive probability to every outcome in the
sample space. In the sense of the preceding discussion, the bookie has implicitly assigned zero
probability to outcome $j$ (unconditionally) if that outcome has zero probability under every
probability measure which yields the bet prices as conditional probabilities, i.e., if $w_j = 0$ in
every non-negative solution to $w'A = 0$, where $A$ is the matrix defined above. The implication
of Theorem 2 is that there exists a combination of bets for which the bookie will lose positive
amounts of money under all those outcomes (and only those outcomes) which he has implicitly
assigned zero probability, while winning nothing under the remaining outcomes. Either
coherence or strict coherence can be used as the criterion for defining and measuring subjective
probability. As Buehler (1976, p. 1057) points out, "the philosophical choice between the two
criteria is clearly linked to one's attitude toward the acceptability of subjective probabilities which
equal zero." De Finetti (1974) argues that zero probabilities are necessary in order to deal with
infinite partitions, and hence favors the weaker criterion, coherence. However, in any
physically realizable, which is to say finite, experiment, it appears that strict coherence would be
more in accord with ordinary standards of behavior. This issue will be illuminated further by
the corresponding distinction between admissible and strictly admissible choices under scoring
rules in a later section. Throughout the remainder of this paper, the term "bet prices" will be
used to denote conditional or unconditional subjective probabilities elicited under the betting
system described above, in order to avoid confusion with probabilities defined, elicited, or
derived in other ways.

The conditions under which a set of bet prices is coherent, as given in Theorem 2, have a
simple geometric interpretation. A probability distribution $w$ can be represented as a vector in
$m$-space lying in the standard simplex defined by $\sum_{j=1}^{m} w_j = 1$, $w \geqq 0$. The bet prices $p$ are
coherent if-and-only-if there exists a distribution $w$ satisfying the system of equations
$w'A = 0$, which will be referred to as the "bet price constraints." Let $a_i$ denote the $i$th column
vector of $A$, i.e., the vector whose $j$th element is $(p_i - E_{ij}) F_{ij}$. Then the bet price constraints
can be written as $w'a_i = 0$, $i = 1, \ldots, n$. Geometrically, $a_i$ is the normal vector of a hyperplane
passing through the origin whose intersection with the simplex is the set of all $w$ which are
consistent with the bet price $p_i$. This is illustrated in Figure 2.1 for the case of a sample space
consisting of only three elements, $\Theta = \{\theta_1, \theta_2, \theta_3\}$. Here the simplex of probability
distributions is the triangle whose vertices are the unit vectors, each vertex being identified with
one element of $\Theta$. The hyperplane normal to the vector pictured is defined by the origin and
the two points on the boundary of the simplex labelled A and B. The line segment AB
represents the set of probability distributions consistent with $p_i$, i.e., those $w$ for which
$P_w(E_i|F_i) = p_i$. Incoherence arises when there is no point in the simplex at which all $n$ bet
price hyperplanes intersect.

Some examples of coherence and incoherence for three-element sample spaces are
illustrated in Figures 2.2, 2.3, and 2.4. These figures are drawn in the plane defined by
$\sum_{j=1}^{m} w_j = 1$, in which the simplex appears as an equilateral triangle. If this triangle is scaled
so that its height is unity, then the probability distribution $(w_1, w_2, w_3)$ corresponding to any
point is determined by letting $w_j$ equal the perpendicular distance from that point to the side
opposite the vertex corresponding to $\theta_j$. Where convenient, the notation $p(E|F)$ and $p(E)$ will
also be used to denote conditional and unconditional bet prices for the events parenthesized.
Figures 2.2a and 2.2b represent the simple case of a complete partition of the sample space, in
which $m = n = 3$, and $p_i = p(\theta_i)$ for all $i$. In this case the bet price constraints reduce to
$w = p$, which, together with the coherence requirement that this must be satisfied for some $w$
in the simplex, implies $p_1 + p_2 + p_3 = 1$. (This is the "total probability theorem.") This condition
is satisfied in 2.2a by $p = (.4, .3, .3)$, and violated in 2.2b by $p = (.6, .3, .3)$. For each $i$, the set of
$w$ for which $w_i = p_i$ is a line parallel to the face opposite the vertex corresponding to $\theta_i$. The
three lines so determined by $p$ have a point of mutual intersection (inside the simplex) in 2.2a,
whereas in 2.2b they do not. The case of an incomplete partition is illustrated in Figures 2.3a,
2.3b, and 2.3c, where $m = 3$, $n = 2$, and $p_i = p(\theta_i)$ for $i = 1, 2$. Here the bet price constraints
on $w$ are $w_1 = p_1$ and $w_2 = p_2$, which is satisfied by some $w$ in [the interior of] the simplex
only if $p_1 + p_2 \le 1$ [$< 1$]. This relation is satisfied in Figures 2.3a and 2.3b by $p = (.3, .6)$ and
$p = (.4, .6)$, respectively. (Note that the latter choice is not strictly coherent, since it implies
$p(\theta_3) = 0$, and the corresponding probability distribution is therefore represented by a point on
the boundary of the simplex, rather than in the interior.) Figure 2.3c illustrates the choice
$p = (.6, .6)$, which is incoherent: in this case the lines representing the bet price constraints
have a point of intersection in the plane, but it is outside the simplex.
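
The complete-partition prices quoted for Figures 2.2a and 2.2b can be checked numerically. The following sketch is illustrative only (the function name `bookie_payoffs` and the particular bet `sell_all` are assumptions, not taken from the report): it evaluates the bookie's payoff when the bettor sells one unit bet on every outcome.

```python
from fractions import Fraction


def bookie_payoffs(p, z):
    """Bookie's payoff under each outcome of a complete partition.

    Bet i is an unconditional unit bet on outcome i, so under outcome j the
    bookie nets sum_i (p_i - [i == j]) * z_i, a special case of (2.1).
    """
    m = len(p)
    return [
        sum((p[i] - (1 if i == j else 0)) * z[i] for i in range(m))
        for j in range(m)
    ]


coherent = [Fraction(4, 10), Fraction(3, 10), Fraction(3, 10)]    # Figure 2.2a
incoherent = [Fraction(6, 10), Fraction(3, 10), Fraction(3, 10)]  # Figure 2.2b

sell_all = [-1, -1, -1]  # the bettor sells one unit bet on each outcome
fair = bookie_payoffs(coherent, sell_all)
sure_win = bookie_payoffs(incoherent, sell_all)
print(fair, sure_win)
```

With the prices of Figure 2.2a the payoff is zero under every outcome, whereas with those of Figure 2.2b the bookie loses 1/5 whichever outcome obtains, so this bet is a sure win for the bettor.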

Whereas the previous examples illustrate the additive law of probability, the next
examples, in Figures 2.4a and 2.4b, illustrate the multiplicative law. For some events $E$ and $F$, let
$p_1 = p(EF)$, $p_2 = p(F)$, and $p_3 = p(E|F)$. The (minimal) relevant sample space is then
$\theta_1 = EF$, $\theta_2 = (1 - F)$, $\theta_3 = (1 - E)F$. Here the bet price constraints are $w_1 = p_1$,
$w_2 = 1 - p_2$, and $w_1(p_3 - 1) + w_3 p_3 = 0$, which implies $\frac{w_1}{w_1 + w_3} = p_3$ unless
$w_1 + w_3 = 0$. Now, coherence requires $w_1 + w_2 + w_3 = 1$, whence $p_2 = 1 - w_2 = w_1 + w_3$,
and also $w \geqq 0$, so that $p_2 \ge p_1 \ge 0$, i.e., $p(F) \ge p(EF) \ge 0$. In particular, $p(F) = 0$
implies $p(EF) = 0$. Otherwise, if $p_2 = p(F) \ne 0$, then the identity
$w_1 = (w_1 + w_3)\left(\frac{w_1}{w_1 + w_3}\right)$ implies $p_1 = p_2 p_3$, i.e., $p(EF) = p(F)\,p(E|F)$. (This is
the "compound probability theorem.") The first two bet price constraints correspond to lines
parallel to the sides opposite the vertices for $\theta_1$ and $\theta_2$, respectively, and the third constraint
corresponds to a line passing through the vertex for $\theta_2$ and through a point on its opposite side.
In Figure 2.4a, where $p = (.2, .5, .4)$ satisfies the multiplicative probability law, these three lines
intersect at a point in the simplex, whereas in Figure 2.4b, where $p = (.4, .5, .4)$ violates the
multiplicative law, these lines have no point of common intersection.
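
The two price vectors quoted for Figures 2.4a and 2.4b can be examined in the same way. The sketch below is a hedged illustration: the names `payoff_matrix`, `bookie_payoffs`, and `candidate_bet` are hypothetical, and the particular bet (sell one unit on $EF$, buy $p_3$ units on $F$, buy one unit on $E$ given $F$) is a choice made here, under which the bookie's payoff works out to $p_2 p_3 - p_1$ in every outcome.

```python
from fractions import Fraction as Fr


def payoff_matrix(p1, p2, p3):
    """Rows: outcomes EF, not-F, (not-E)F. Columns: bets on EF, F, E given F.

    Each entry is (price - E_ij) * F_ij, as in the proof of Theorem 2.
    """
    return [
        [p1 - 1, p2 - 1, p3 - 1],  # theta_1 = EF
        [p1, p2, 0],               # theta_2 = not-F (conditional bet called off)
        [p1, p2 - 1, p3],          # theta_3 = (not-E)F
    ]


def bookie_payoffs(A, z):
    """Payoff vector t = Az for the bookie."""
    return [sum(a * z_i for a, z_i in zip(row, z)) for row in A]


def candidate_bet(p3):
    """Sell one unit on EF, buy p3 units on F, buy one unit on E given F."""
    return [-1, p3, 1]


p_a = (Fr(2, 10), Fr(5, 10), Fr(4, 10))  # Figure 2.4a: p1 = p2 * p3
p_b = (Fr(4, 10), Fr(5, 10), Fr(4, 10))  # Figure 2.4b: p1 > p2 * p3

t_a = bookie_payoffs(payoff_matrix(*p_a), candidate_bet(p_a[2]))
t_b = bookie_payoffs(payoff_matrix(*p_b), candidate_bet(p_b[2]))
print(t_a, t_b)
```

For the prices of Figure 2.4a the payoffs are all zero, while for those of Figure 2.4b the bookie loses 1/5 under every outcome. If instead $p_1 < p_2 p_3$, reversing the sign of every component of the bet would give the bettor the sure win.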

3. Identifying and reconciling incoherence via linear programming

It has been seen that, given a vector $p$ of bet prices, the problem of finding a "sure-win"
bet (the "bettor's problem") and the alternative problem of finding a corresponding distribution
on the sample space (the "bookie's problem") involve finding solutions to systems of linear
inequalities. These are textbook linear programming problems, and, moreover, the role of the
corollary to Theorem 1 (which is a variant of Farkas' Lemma) in the proof of Theorem 2
suggests that the bookie's and bettor's problems are in fact dual to each other. This linear
programming application does not appear, however, to have received explicit treatment in the
literature, perhaps because the subject of coherence has generally been considered to be of
more theoretical than practical interest. The conventional emphasis has instead been on
calibration, i.e., obtaining subjective probability assessments which agree with observed
frequencies. The seminal paper of Lindley, Tversky, and Brown (1979) has pointed out the
relevance of coherence considerations in improving the precision of probability assessments and
in combining assessments by different experts, as well as in avoiding mere inconsistency. Their
approach to the identification and reconciliation of incoherence is thoroughly Bayesian, and uses
a "coherent observer," equipped with a prior distribution and likelihood function, to perform
the reconciliation. In this section several geometrically-motivated linear programs for the
identification and reconciliation of incoherence will be discussed. It will be seen that, under
certain assumptions and conditions, the Bayesian and linear programming approaches closely
resemble each other.

As a starting point, consider the following linear program (which will be called LP1),
which can be used to distinguish between coherence, strict coherence, and incoherence:


Primal: maximize $y_0$ (3.1a)

subject to $y_0 + y_j + \sum_{i=1}^{n} a_{ij} z_i = 0$, $j = 1, \ldots, m$ (3.1b)

$\sum_{j=1}^{m} \beta_j y_j = 1$ (3.1c)

$y_j \ge 0$, $j = 1, \ldots, m$. (3.1d)

Dual: minimize $w_0$ (3.2a)

subject to $\sum_{j=1}^{m} a_{ij} w_j = 0$, $i = 1, \ldots, n$ (3.2b)

$\sum_{j=1}^{m} w_j = 1$ (3.2c)

$w_j + \beta_j w_0 \ge 0$, $j = 1, \ldots, m$. (3.2d)


The parameters of this linear program are the matrix $A$, whose $(j,i)$th element is
$a_{ij} = (p_i - E_{ij}) F_{ij}$, and a vector $\beta = (\beta_1, \ldots, \beta_m)$ of positive weights. It will be convenient,
with no loss of generality, to assume that $\sum_{j=1}^{m} \beta_j = 1$ and to consider $\beta$ to represent the
bettor's probability distribution on the sample space. In the primal program the natural
variables are $z_1, \ldots, z_n$, where $z_i$ represents the number of unit bets on the $i$th event pair
purchased by the bettor. $y_1, \ldots, y_m$ are non-negative slack variables, and $y_0$ is essentially an
artificial variable guaranteeing the existence of a feasible solution. (A starting feasible solution is
$y_0 = -1$, $y_j = 1$ for $j = 1, \ldots, m$, and $z_i = 0$ for $i = 1, \ldots, n$.) The primal program can be
interpreted as searching for a combination of bets which maximizes the ratio of the bettor's
minimum payoff to his expected payoff, provided the latter quantity can be made positive.
Since the quantity $\sum_{i=1}^{n} a_{ij} z_i$ represents the payoff to the bookie under the $j$th outcome,
constraint (3.1b) implies that in every feasible solution $y_0 + y_j$ represents the payoff to the
bettor under the same outcome. Since $y_j$ is constrained to be non-negative, $y_0$ is evidently less
than or equal to the bettor's minimum payoff. The bettor's expected payoff is then given by
$y_0 + \sum_{j=1}^{m} \beta_j y_j$. Star notation ($z_i^*$, $y_j^*$, $w_j^*$, etc.) will be used to denote the values of the
primal and dual variables in an optimal solution. If all bets have zero expectation for the bettor
(a special case of strict coherence) then $y_0^*$ will be $-1$ in view of constraint (3.1c). In all other
cases, when bets with non-zero expectation are possible, $y_0^*$ will be greater than $-1$, and will
represent the minimum payoff (implying $y_j^* = 0$ for at least one $j \ge 1$) for some bet with
positive expectation. In fact, $y_0^*$ will be the maximum possible minimum payoff among all bets
whose expected payoff exceeds the minimum payoff by exactly unity. Moreover, no bet with
the same expectation as that of the optimal primal solution can have a larger minimum payoff,
since otherwise constraint (3.1c) would not be tight, and such a bet could therefore be scaled
up to obtain a feasible solution with a higher objective value, contradicting optimality. Hence,
$y_0^*$ must be negative, zero, or positive according to whether the bet prices are strictly coherent,
coherent (but not strictly coherent), or incoherent, by the very definitions of these terms. If
$y_0^* = 0$, then $z^* = (z_1^*, \ldots, z_n^*)$ is a "can't-lose" bet; and if $y_0^* > 0$, then $z^*$ is a "sure-win"
bet. A special case arises when there is a bet for which the payoffs are identical and positive,
i.e., the minimum payoff equals the expected payoff, in which case $y_0^*$ is not only positive, but
infinite. In this case, the components of the "sure-win" bet can be found in the column of the
simplex tableau corresponding to the entering variable which produces an unbounded increase
in the objective function. If $y_0^*$ is negative, then the bet $z^{**}$, defined by
$z_i^{**} = \frac{z_i^*}{-y_0^*}$ for all $i$, achieves the largest possible expected payoff among all bets whose
minimum payoff is greater than or equal to $-1$.
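
The interpretation of the primal constraints can be made concrete by constructing a feasible point of (3.1b)-(3.1d) from an arbitrary bet. The sketch below does not solve LP1; it only rescales a given bet so that the bettor's expected payoff exceeds his minimum payoff by exactly unity, and reports the resulting $y_0$. The function name `lp1_point_from_bet`, the uniform choice of $\beta$, and the example bet are assumptions made for illustration.

```python
from fractions import Fraction as Fr


def lp1_point_from_bet(A, beta, z):
    """Scale the bet z to a feasible point (y0, y, z_scaled) of (3.1b)-(3.1d).

    A[j][i] = a_ij (rows are outcomes, columns are bets); beta is the
    bettor's distribution. Returns None when the expected payoff equals
    the minimum payoff, in which case no finite scaling satisfies (3.1c).
    """
    g = [-sum(a * z_i for a, z_i in zip(row, z)) for row in A]  # bettor's payoffs
    g_min = min(g)
    g_exp = sum(b * g_j for b, g_j in zip(beta, g))
    gap = g_exp - g_min
    if gap == 0:
        return None
    s = Fr(1) / gap                      # scale so that (3.1c) holds
    y0 = s * g_min                       # scaled minimum payoff
    y = [s * g_j - y0 for g_j in g]      # slacks: y0 + y_j = scaled payoff
    return y0, y, [s * z_i for z_i in z]


# Incomplete partition of Figure 2.3c: unit bets on theta_1 and theta_2,
# both priced at 3/5, with three outcomes.
A = [
    [Fr(-2, 5), Fr(3, 5)],   # theta_1
    [Fr(3, 5), Fr(-2, 5)],   # theta_2
    [Fr(3, 5), Fr(3, 5)],    # theta_3
]
beta = [Fr(1, 3)] * 3        # hypothetical uniform distribution for the bettor
y0, y, z_scaled = lp1_point_from_bet(A, beta, z=[-1, -1])
print(y0, y, z_scaled)
```

Here selling both bets yields a feasible point with $y_0 = 3/5 > 0$, which is already enough to show that these prices are incoherent, since the optimal value $y_0^*$ can be no smaller than the value at any feasible point.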

The dual program can be interpreted as searching for a probability distribution on the
sample space which is consistent with the bet prices and which is also as close as possible, in a
certain sense, to the bettor's distribution. The dual natural variables consist of $w_0$ together with
the elements of the vector $w = (w_1, \ldots, w_m)$. The solution of the dual program by the simplex
algorithm can be visualized in the $m$-dimensional space in which $w$ is represented, along
the lines of the geometric interpretation of coherence presented in the last section. The feasible
region for $w$ is the set of points in the hyperplane defined by $\sum_{j=1}^{m} w_j = 1$ which satisfy the bet
price constraints $w'A = 0$. (Note that the feasible region may contain points outside the simplex,
i.e., which do not also satisfy $w \geq 0$.) If this set is non-empty (which is the case if-and-only-if
the objective function is bounded in the optimal primal solution) then the solution of the dual
program by the primal simplex algorithm involves starting at some feasible point and then moving
within the feasible region toward (the interior of) the simplex by maximizing the weighted
minimum of the coordinates, with the weights being the reciprocals of the bettor's probabilities.
This is seen by rewriting the dual objective (3.2a) as "maximize $-w_0$" and rewriting the constraint
(3.2d) as "$-w_0 \leq w_j / \beta_j$." If the dual program terminates at a point in the interior of the
simplex (i.e., the weighted minimum coordinate is positive, and hence $w_0^*$ is negative), then a
positive probability distribution (namely $w^*$) has been found which agrees with the bet prices,
and strict coherence has been established, according to Theorem 2. If termination occurs at a
point on the surface of the simplex (i.e., the minimum coordinate is zero, and so is $w_0^*$), then a
semi-positive distribution has been found, establishing coherence but not strict coherence.
Finally, if termination occurs outside the simplex (i.e., one of the final coordinates is negative,
hence $w_0^*$ is positive), then no appropriate semi-positive distribution exists, and incoherence is
established. Of course, by the duality theorem of linear programming, $y_0^* = w_0^*$, so that LP1
may be taken as a constructive proof of Theorem 2. The special case in which $w_0^* = y_0^* = -1$,
which arose in the primal program when every bet had zero expectation for the bettor, is seen
in the dual program to represent the case in which the bettor's probability distribution is
consistent with the bet prices--i.e., the optimal dual solution is $w_j^* = \beta_j$, $j = 1, \ldots, m$.
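
The dual quantities described above can be evaluated directly for a candidate point. The sketch below is illustrative only: it checks the two equality constraints and computes the smallest $w_0$ compatible with $-w_0 \leq w_j/\beta_j$. The two-outcome example, the bettor distributions, and the function names are assumptions; the sign of $w_0$ classifies the bet prices only when $w$ is the optimal dual point, which holds here because the feasible set is a single point.

```python
def dual_objective(w, beta):
    """Smallest w0 satisfying -w0 <= w_j / beta_j for every j."""
    return -min(wj / bj for wj, bj in zip(w, beta))


def is_dual_feasible(w, columns, tol=1e-9):
    """Check sum_j w_j = 1 and w'a_i = 0 for every bet-price column a_i."""
    on_hyperplane = abs(sum(w) - 1.0) <= tol
    prices_met = all(abs(sum(wj * aj for wj, aj in zip(w, col))) <= tol
                     for col in columns)
    return on_hyperplane and prices_met


# Illustrative example: two outcomes, one unconditional bet on outcome 0 at price 0.3.
p = 0.3
a = [p - 1.0, p - 0.0]            # a_j = (p - E_j) * F_j with E = (1, 0), F = (1, 1)
w = [0.3, 0.7]                    # the only point meeting both equality constraints

feasible = is_dual_feasible(w, [a])
w0_uniform = dual_objective(w, [0.5, 0.5])   # bettor holds a uniform distribution
w0_agree = dual_objective(w, [0.3, 0.7])     # bettor's distribution matches the prices
print(feasible, w0_uniform, w0_agree)
```

Both values of $w_0$ are negative, in line with strict coherence of a single price strictly between 0 and 1, and the second equals $-1$ because the bettor's distribution agrees with the bet price.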

If a set of bet prices is found to be incoherent (e.g., by solving LP1), then presumably it
will be desired to revise them so as to reconcile the incoherence. Properly considered, this
ought to involve introspection and careful reassessment on the part of the bookie. However,
insofar as the constraints imposed by coherence may be too numerous or subtle to keep in
mind during this process, it might be useful or even necessary to have an external procedure
for identifying coherent sets of bet prices which are in some sense "close" to the original
incoherent set, in order to help the bookie explore his alternatives. A Bayesian reconciliation
scheme has been presented by Lindley, Tversky, and Brown (1979). In their "internal
approach," a coherent observer is introduced who considers the bookie's true, coherent bet
prices as uncertain parameters to be estimated. The observer has a (continuous) prior distribution
for these parameters and a likelihood function specifying the distribution of the errors in
the bookie's stated bet prices given his true bet prices. The posterior distribution is then computed
using Bayes' Theorem, and a vector of revised, coherent prices can be obtained as the
posterior expected value of the true bet price vector, subject to the constraints imposed by
coherence. If the coherence constraints are non-linear, then the set of possible coherent vectors
may be non-convex, in which case the posterior mode, rather than the posterior expected
value, must be used, since the barycenter of a distribution of mass on a non-convex set may be
a point outside the set. Unfortunately, there are several practical difficulties associated with the
strict Bayesian approach, namely the assessment of the "core" distributions required in the
conditioning process, and the complexity of the calculations.

To simplify matters, Lindley et al. suggest a least-squares approach to finding a reconciled
bet price vector, which is consistent with the assumption of a flat prior distribution and normally
distributed errors. If the errors are also assumed independent, the reconciliation is
obtained by minimizing the weighted sum of squares:

    $\sum_{i=1}^{n} \tau_i^2 (p_i - \pi_i)^2$    (3.3)

over the set of all coherent $\pi$, where $\tau = (\tau_1, \ldots, \tau_n)$ is a vector of positive weights equal to
the reciprocals of the error standard deviations. Since the elements of $p$ are restricted to the
unit interval, they can at best only be approximately normally distributed with respect to the
"true" bet prices. Therefore, it may be appropriate to assume that some transform of each bet
price, say $F(p_i)$, is normally distributed, and then minimize the sum of the squared differences
of the transforms:

    $\sum_{i=1}^{n} \tau_i^2 (F(p_i) - F(\pi_i))^2$.    (3.4)

In particular, the log odds transform is recommended: $F(p) = \log\frac{p}{1-p}$. Lindley et al. refer to
this transformation as the "choice of metric"--probability metric, log-odds metric, etc.--and suggest
that the choice of metric should reflect the transformation under which the error variance
is most nearly constant. For purposes of later comparison, note that the sum of squares (3.3)
is the squared distance between $p$ and $\pi$, following a linear transformation by the matrix
$\mathrm{diag}(\tau_1, \ldots, \tau_n)$, using the $l_2$ norm. An alternative choice of metric would be to minimize
the distance between these vectors using a different norm. For example, in the $l_1$ norm the
corresponding distance is

    $\sum_{i=1}^{n} \tau_i |p_i - \pi_i|$    (3.5)

and in the $l_\infty$ norm the distance is

    $\max_i \tau_i |p_i - \pi_i|$.    (3.6)

Of course, a vector minimizing one of these distances would not exactly correspond to the
mode of a normal posterior distribution, and the weights $\tau$ would not be interpretable as
reciprocal variances, but simply as a set of (subjectively chosen) confidence or precision factors.
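
To make the three candidate metrics concrete, the following sketch evaluates (3.3), (3.5), and (3.6) for one pair of vectors. The stated prices, the coherent candidate, the equal weights, and the function name are illustrative assumptions; the code evaluates the distances for a given candidate and does not search for the minimizing vector.

```python
def weighted_distances(p, pi, tau):
    """Return the distances (3.3), (3.5) and (3.6) between p and pi under weights tau."""
    dev = [t * abs(a - b) for a, b, t in zip(p, pi, tau)]
    return {
        "sum_of_squares": sum(d * d for d in dev),   # (3.3), squared l2 distance
        "l1": sum(dev),                              # (3.5)
        "l_inf": max(dev),                           # (3.6)
    }


# Illustrative stated prices for a four-event partition (they sum to 0.95) and one
# coherent candidate obtained by adding the same amount to each price.
p = [0.33, 0.27, 0.23, 0.12]
pi = [0.3425, 0.2825, 0.2425, 0.1325]
tau = [1.0, 1.0, 1.0, 1.0]                           # equal precision, an assumption

dist = weighted_distances(p, pi, tau)
print(dist)
```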

The various minimizations suggested above must all be performed subject to the coherence
constraints on $\pi$, which consist of a set of equalities and inequalities of the form

    $h_k(\pi) = 0 \; [\geq 0]$, $k = 1, \ldots, K$    (3.7)

where the functions $(h_1, \ldots, h_K)$ consist of sums and products of the elements of $\pi$ and constants,
representing the requirements of the additive, multiplicative, and convexity laws of probability.
They are essentially implicit functions determined by the equations
$\sum_j (\pi_i - E_{ij}) F_{ij} w_j = 0$, $i = 1, \ldots, n$, and $\sum_j w_j = 1$, together with the inequalities
$w_j \geq 0$, $j = 1, \ldots, m$. A practical problem may arise if the multiplicative probability law is
involved, in which case some of the constraint functions will be nonlinear. The resulting constraint
set may be non-convex, and exact global minimization by systematic nonlinear programming
methods may therefore be difficult. Moreover, if the constraint set is pathologically
shaped, the implicit assumption of a flat distribution on it may be questionable.

An alternative approach, suggested by the geometric representation of incoherence
emphasized in this paper, would be to represent the reconciled assessment in terms of a
corresponding distribution on the sample space--i.e., $\pi_i = P_w(E_i \mid F_i)$, $i = 1, \ldots, n$--and then
perform the minimization over $w$. The constraint set for $w$ is in all cases a convex set, namely
the standard simplex in $m$-space. The practical difficulties with this approach are associated
with the nature of the resulting objective function, since

    $p_i - P_w(E_i \mid F_i) = \frac{\sum_j (p_i - E_{ij}) F_{ij} w_j}{P_w(F_i)}$.    (3.8)

Note the presence in the denominator of the term $P_w(F_i)$. Unless the probabilities being
assessed are all unconditional (in which case $F_i = \Theta$ and $P_w(F_i) = 1$ for all $i$), the quantity (3.8) is
a nonlinear and not necessarily convex function of $w$. Thus, unfortunately, in changing variables
from $\pi$ to $w$ in order to obtain a convex constraint set for performing the minimizations
(3.3), (3.5), or (3.6), a non-convex objective function may be obtained, which again may be
difficult to minimize globally. One way around this difficulty is to simply ignore the term in the
denominator of (3.8), and concentrate on minimizing an appropriate function of the quantities

    $w'a_i = \sum_j (p_i - E_{ij}) F_{ij} w_j$, $i = 1, \ldots, n$,    (3.9)

where $a_i$ again denotes the $i$th column vector of the matrix $A$. This seemingly ad hoc linearization
of the objective function has a significant and interesting geometric interpretation in the
space of probability distributions on the sample space, for the quantity $w'a_i$ is proportional to
the Euclidean distance from $w$ to the nearest point in the hyperplane of the simplex which
satisfies the $i$th bet price constraint. To show this, let $u_i(w)$ denote the vector which minimizes
$\|w - u\|$ subject to $u'a_i = 0$ and $\sum_j u_j = 1$, and let $d_i(w) = \|w - u_i(w)\|$. Note that if $u_i(w) \geq 0$,
then $u_i(w)$ is the closest distribution to $w$ (in the Euclidean sense) which yields $p_i$ as the conditional
probability for $E_i$ given $F_i$. Necessarily, $w - u_i(w)$ is proportional to the vector $a_i^0$,
whose $j$th component is $a_{ij}^0 = a_{ij} - \frac{1}{m}\sum_k a_{ik}$, which is obtained by projecting $a_i$ on the
hyperplane $\sum_j u_j = 0$. From the definition of $d_i(w)$, it follows that $u_i(w) = w \pm \frac{d_i(w)}{\|a_i^0\|} a_i^0$. Enforcing
$u_i(w)'a_i = 0$, and noting that $a_i'a_i^0 = a_i^{0\prime}a_i^0$, yields:

    $d_i(w) = \frac{|w'a_i|}{\|a_i^0\|} = \frac{|w'a_i|}{\sqrt{\sum_j a_{ij}^2 - \frac{1}{m}\left(\sum_j a_{ij}\right)^2}}$.    (3.10)

Note that $w'a_i$ is the expected payoff to the bookie for a unit bet placed on $E_i$ given $F_i$, under
the distribution $w$. On the other hand, it is evident from the expansion on the RHS of (3.10)
that the quantity $\|a_i^0\|$, which is a sort of normalizing factor, is $\sqrt{m}$ times the standard deviation
of the payoff for a unit bet on $E_i$ given $F_i$ under the uniform distribution. It is readily
shown that, if $F_{ij} = 1$ for all $j$ (i.e., if $p_i$ is an unconditional bet price for $E_i$), then $\|a_i^0\|$
is independent of $p_i$. If, however, $F_{ij} = 0$ for at least one $j$, then $\|a_i^0\|^2$ is a convex quadratic
function of $p_i$ which is minimized when $p_i$ equals the ratio of the number of outcomes in $\Theta$ for
which $E_{ij}F_{ij} = 1$ to the number of outcomes for which $F_{ij} = 1$--i.e., the value for the conditional
probability of $E_i$ given $F_i$ which is obtained under the uniform distribution.
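
The distance formula (3.10) can be checked numerically by constructing the nearest point explicitly. The sketch below does this for a single conditional bet; the three-outcome example, the price, the distribution $w$, and the helper names are assumptions made for illustration.

```python
import math


def centered(a):
    """Project a onto the hyperplane of vectors whose components sum to zero."""
    mean = sum(a) / len(a)
    return [x - mean for x in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def distance_to_price_hyperplane(w, a):
    """d(w) = |w'a| / ||a0||, as in (3.10)."""
    a0 = centered(a)
    return abs(dot(w, a)) / math.sqrt(dot(a0, a0))


def nearest_point(w, a):
    """Closest u to w with u'a = 0 and the same coordinate sum as w."""
    a0 = centered(a)
    t = dot(w, a) / dot(a0, a0)
    return [wi - t * ai for wi, ai in zip(w, a0)]


# Illustrative conditional bet: three outcomes, E = {0}, F = {0, 1}, price 0.4.
p, E, F = 0.4, [1, 0, 0], [1, 1, 0]
a = [(p - e) * f for e, f in zip(E, F)]
w = [0.5, 0.3, 0.2]

u = nearest_point(w, a)
d_formula = distance_to_price_hyperplane(w, a)
d_direct = math.sqrt(sum((wi - ui) ** 2 for wi, ui in zip(w, u)))
print(u, d_formula, d_direct)
```

The constructed point satisfies both constraints, and its Euclidean distance from $w$ agrees with the value given by the formula.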

In the space in which distributions on $\Theta$ are represented, the distance $d_i(w)$ appears to be
a reasonable measure of the "error" in the bet price $p_i$ when the "true" distribution is $w$. Minimization
on the simplex of an appropriate convex function of these distances will yield a distribution
which satisfies a heuristic admissibility criterion, namely that no other distribution exists
which is uniformly closer to all the hyperplanes determined by the bet price constraints. In the
examples of incoherence illustrated in the previous section, the sets of distributions which are
admissible in this sense are represented by the shaded areas in Figures 2.2b and 2.4b, and the
line segment AB in Figure 2.3c. Let $\gamma = (\gamma_1, \ldots, \gamma_n)$ be a vector of positive weights representing
relative confidence or precision under this measure of "bet price error," incorporating the
normalizing factors $\|a_i^0\|$, $i = 1, \ldots, n$, suggested by (3.10). Then, by analogy with (3.5),
(3.6), and (3.3), some possible objective functions for minimization are:

    $\sum_{i=1}^{n} \gamma_i |w'a_i|$    (3.11)

or

    $\max_i \gamma_i |w'a_i|$    (3.12)

or else the quadratic form

    $\sum_{i=1}^{n} \gamma_i^2 (w'a_i)^2 = w'(AMA')w$    (3.13)

where $M = \mathrm{diag}(\gamma_1^2, \ldots, \gamma_n^2)$. (More generally, $M$ could be any positive definite matrix.) The
minimization on the simplex of either (3.11) or (3.12) is a straightforward linear program, and
(3.13) is a quadratic program with linear constraints. However, (3.12) appears to be a much
more suitable objective function for practical application than (3.11). The close relation
between (3.11) and the constraint $\sum_j w_j = 1$ suggests that the solution may be highly sensitive to
relatively small changes in the weights, and will tend to be an extreme point of the admissible
region. (In particular, for the case of an incoherent partition, only the bet price with the largest
weight is likely to be revised.) By comparison, it appears that minimization of (3.12) will generally
yield an interior point of the admissible set--in fact, with an appropriate choice of
weights, any admissible point can be reached--and it will be seen to yield essentially the same
solution as the quadratic minimization (3.13) in the case of a partition.
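
The three objective functions are easy to evaluate at a given distribution. The sketch below computes (3.11), (3.12), and (3.13) and also evaluates the matrix form $w'(AMA')w$ entry by entry as a cross-check. The partition prices, the weights, the trial distribution, and the function names are illustrative assumptions; no minimization is performed.

```python
def objectives(w, columns, gamma):
    """Evaluate (3.11), (3.12) and (3.13) at the distribution w."""
    dev = [sum(wj * aj for wj, aj in zip(w, col)) for col in columns]   # w'a_i
    l1 = sum(g * abs(d) for g, d in zip(gamma, dev))
    l_inf = max(g * abs(d) for g, d in zip(gamma, dev))
    quad = sum((g * d) ** 2 for g, d in zip(gamma, dev))
    return l1, l_inf, quad


def quadratic_form(w, columns, gamma):
    """Compute w'(A M A')w directly, with M = diag(gamma_i^2)."""
    m = len(w)
    total = 0.0
    for j in range(m):
        for k in range(m):
            entry = sum(g * g * col[j] * col[k] for g, col in zip(gamma, columns))
            total += w[j] * entry * w[k]
    return total


# Illustrative three-event partition with unconditional prices summing to 0.95.
prices = [0.50, 0.30, 0.15]
columns = [[p - (1.0 if i == j else 0.0) for j in range(3)]
           for i, p in enumerate(prices)]
gamma = [1.0, 2.0, 1.0]
w = [0.52, 0.31, 0.17]            # a trial distribution on the three outcomes

l1, l_inf, quad = objectives(w, columns, gamma)
quad_matrix = quadratic_form(w, columns, gamma)
print(l1, l_inf, quad, quad_matrix)
```

For this trial point the three weighted deviations are equal in magnitude, which is the pattern the minimax objective (3.12) tends to produce.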

The linear program representing the minimization of (3.12) will now be discussed in some
detail. First, note that minimizing $\max_i \gamma_i |w'a_i|$ is equivalent to finding the smallest number
$v$ such that

    $-\frac{v}{\gamma_i} \leq w'a_i \leq \frac{v}{\gamma_i}$, $i = 1, \ldots, n$    (3.14)

for some $w$ in the simplex. For added flexibility, let each weight $\gamma_i$ be replaced by a pair of
possibly-unequal positive weights $\gamma_i^+$ and $\gamma_i^-$, with $\gamma_i^+$ substituted for $\gamma_i$ on the left and $\gamma_i^-$
substituted for $\gamma_i$ on the right in (3.14). This allows for the possibility that, in seeking an
optimal reconciled bet price vector, positive and negative deviations from each of the initial bet
prices will be weighted differently, which might be especially desirable for bet prices very near
to 0 or 1. The corresponding primal/dual pair of linear programs is then:


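Primal:  maximize  $y_0$    (3.15a)

    subject to  $y_0 + \sum_{i=1}^{n} a_{ij}(z_i^+ - z_i^-) \leq 0$, $j = 1, \ldots, m$    (3.15b)

    $\sum_{i=1}^{n} \left( \frac{z_i^+}{\gamma_i^+} + \frac{z_i^-}{\gamma_i^-} \right) = 1$    (3.15c)

    $z_i^+ \geq 0$, $z_i^- \geq 0$, $i = 1, \ldots, n$.    (3.15d)

Dual:  minimize  $v$    (3.16a)

    subject to  $-\frac{v}{\gamma_i^+} \leq \sum_{j=1}^{m} a_{ij} w_j \leq \frac{v}{\gamma_i^-}$, $i = 1, \ldots, n$    (3.16b)

    $\sum_{j=1}^{m} w_j = 1$    (3.16c)

    $w_j \geq 0$, $j = 1, \ldots, m$.    (3.16d)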

Note that in the corresponding primal program the unrestricted-sign variable $z_i$ of LP1 (the
number of unit bets on $E_i$ given $F_i$) has been replaced by a pair of non-negative variables, $z_i^+$
and $z_i^-$, representing its positive and negative parts. The primal objective is still the maximization
of the bettor's minimum payoff, but the external constraint is now a prescribed value for
the weighted sum of the numbers of unit bets bought and sold, rather than a prescribed value
for the difference between the minimum payoff and the expected payoff. In view of the dual
constraints (3.16b), the optimal objective value can never be negative. In the primal program
this is reflected in the fact that the bettor can always satisfy constraint (3.15c) by buying and
selling equal numbers of bets on any event, which is equivalent to not betting, since the buying
and selling prices are equal.
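
For the special case of a partition with unconditional prices summing to less than one, matching primal and dual solutions of LP2 can be written down directly, and equality of their objective values certifies optimality by linear programming duality. The following sketch constructs such a pair without calling an LP solver. It assumes $\gamma_i^+ = \gamma_i^- = \gamma_i$ and that each event is a single outcome; the prices, the weights, and the function name are illustrative.

```python
def lp2_partition_certificate(p, gamma):
    """Primal and dual LP2 solutions for partition prices with sum(p) < 1.

    Assumes gamma_i^+ = gamma_i^- = gamma_i and that event i is outcome i.
    """
    n = len(p)
    s = sum(1.0 / g for g in gamma)
    v = (1.0 - sum(p)) / s
    w = [pi + v / g for pi, g in zip(p, gamma)]      # dual: candidate distribution
    z_plus = [1.0 / s] * n                           # primal: buy 1/s units of each bet
    z_minus = [0.0] * n

    # a[i][j] = p_i - 1 if i == j, else p_i (unconditional bets on a partition)
    a = [[p[i] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]

    # Primal objective: largest y0 with y0 + sum_i a_ij (z_i^+ - z_i^-) <= 0 for all j.
    y0 = min(-sum(a[i][j] * (z_plus[i] - z_minus[i]) for i in range(n))
             for j in range(n))
    # Dual objective: smallest v with -v/gamma_i <= w'a_i <= v/gamma_i for all i.
    dev = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
    v_dual = max(g * abs(d) for g, d in zip(gamma, dev))
    # Left-hand side of (3.15c), which must equal 1 for primal feasibility.
    budget = sum(zp / g + zm / g for zp, zm, g in zip(z_plus, z_minus, gamma))
    return y0, v_dual, w, budget


p = [0.33, 0.27, 0.23, 0.12]          # illustrative prices summing to 0.95
gamma = [1.0, 2.0, 4.0, 4.0]          # illustrative confidence weights
y0, v_dual, w, budget = lp2_partition_certificate(p, gamma)
print(y0, v_dual, w, budget)
```

Because the primal solution is feasible, the dual point lies in the simplex, and the two objective values coincide, neither can be improved. Prices with larger weights are revised less.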

An interesting special case is obtained by letting $\gamma_i^+ = \frac{1}{1 - p_i}$ and $\gamma_i^- = \frac{1}{p_i}$ for every $i$,
whence (3.15c) becomes

    $\sum_{i=1}^{n} \left( (1 - p_i) z_i^+ + p_i z_i^- \right) = 1$.    (3.17)

Note that the quantity on the left is the bettor's a priori maximum payoff--that is, the amount
he would receive from the bookie if he won every bet. This is "prior" to an analysis of the logical
dependencies among the events, which might show certain joint outcomes of the events to
be impossible, in which case certain combinations of bets could not be won. This constraint
may be considered to describe the situation in which the bookie has finite resources, and will
only accept bets up to the amount he can "cover" by separately matching each bet with the
amount the bettor might win. This weighting also has an interesting interpretation in the dual
program. Recall that

    $p_i - P_w(E_i \mid F_i) = \frac{w'a_i}{P_w(F_i)}$,    (3.18)

whence the differences between the initial bet prices and the conditional probabilities based on
the distribution $w$ have the same signs as, and are approximately proportional to, the quantities
$w'a_i$, whose weighted deviations from zero are minimized in the dual program. Under this
weighting, a positive deviation from zero of $w'a_i$, corresponding to a positive difference
between $p_i$ and $P_w(E_i \mid F_i)$, is weighted in proportion to $\frac{1}{p_i}$--i.e., in inverse proportion to the
maximum positive difference possible (which is obtained when $P_w(E_i \mid F_i) = 0$). A corresponding
effect is obtained with respect to negative deviations. Roughly speaking, initial bet prices near
zero tend to be revised upward rather than downward in the reconciliation process using these
weights, and vice versa for initial bet prices near unity. The reconciliation process, in this case,
tends to pull all the elements of $p$ toward the value $\frac{1}{2}$, insofar as this can be done coherently.

The reconciliation scheme of LP2 resembles a Bayesian approach formulated in the $m$-space
of probability distributions on the sample space rather than the $n$-space of bet price vectors.
In fact, for the simple case in which the events constitute a partition, where every
coherent bet price vector is also a distribution on the sample space, LP2 (with appropriate
weights) yields the same reconciled values as the "internal approach" of Lindley et al.
under the "probability metric"--i.e., using the quadratic minimization (3.3), which is also the
same as (3.13) in this case. Here the dual objective (maximum weighted deviation) in LP2 and
the weighted sum of squared deviations in (3.3) are both minimized when their respective
weighted deviations are all equal--i.e., when the deviations are proportional to the inverses of
the corresponding weights. Let $\gamma_i^+ = \gamma_i^- = \tau_i^2$ for all $i$. Then, letting $\pi^*$ denote the coherent
bet price vector which minimizes (3.3) and letting $P_w^*$ denote the probability measure
corresponding to the optimal solution to LP2, we have:

    $\pi_i^* = P_w^*(E_i \mid F_i) = p_i + \frac{1 - \sum_{k=1}^{n} p_k}{\tau_i^2 \sum_{k=1}^{n} \tau_k^{-2}}$, $i = 1, \ldots, n$.    (3.20)

To illustrate the application of LP2 to an actual problem, more difficult than a simple partition,
consider the following example of an incoherent assessment which was given in Lindley et al.:
$p(H) = .33$, $p(C) = .27$, $p(D) = .23$, $p(N) = .12$, $p(H \mid \bar{N}) = .41$, $p(C \mid \bar{N}) = .31$, and $p(D \mid \bar{N}) = .28$.
Here $m = 4$, $n = 7$, and $H$, $C$, $D$, and $N$ form a partition of the sample space, so that $H$, $C$, and
$D$ also form a partition of $\bar{N}$. Both the additive law and the multiplicative law are violated--
e.g., $p(H) + p(C) + p(D) + p(N) \neq 1$, $p(H) \neq p(H \mid \bar{N})(1 - p(N))$, etc. Four iterations of the
simplex algorithm on LP2, with all weights equal (for lack of further information), yield the
following reconciled values: $P_w^*(H) = .3428$, $P_w^*(C) = .2816$, $P_w^*(D) = .2428$, $P_w^*(N) = .1328$,
$P_w^*(H \mid \bar{N}) = .3953$, $P_w^*(C \mid \bar{N}) = .3247$, and $P_w^*(D \mid \bar{N}) = .28$.
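
The reconciled figures quoted for this example can be checked for internal consistency with a few lines of arithmetic. The sketch below takes the four reconciled unconditional probabilities as printed (rounded to four places), recomputes the three conditional probabilities, and evaluates the quantities $w'a_i$ for all seven stated prices. The variable names are assumptions, and the code verifies the printed values rather than re-solving LP2.

```python
# Reconciled unconditional probabilities as printed in the text (outcomes H, C, D, N).
w = {"H": 0.3428, "C": 0.2816, "D": 0.2428, "N": 0.1328}
stated = {"H": 0.33, "C": 0.27, "D": 0.23, "N": 0.12}
stated_given_not_n = {"H": 0.41, "C": 0.31, "D": 0.28}

p_not_n = 1.0 - w["N"]
conditional = {k: w[k] / p_not_n for k in ("H", "C", "D")}

# w'a_i for an unconditional price is p_i - P_w(E_i);
# for a price conditional on not-N it is p_i * P_w(not N) - P_w(E_i).
deviations = [stated[k] - w[k] for k in stated]
deviations += [stated_given_not_n[k] * p_not_n - w[k] for k in stated_given_not_n]
max_deviation = max(abs(d) for d in deviations)

print(round(sum(w.values()), 4), {k: round(c, 4) for k, c in conditional.items()})
print(round(max_deviation, 4))
```

With the printed four-decimal figures the recomputed conditional probabilities match the quoted ones, and the largest deviation is close to 0.0128, a magnitude shared (to rounding) by five of the seven prices, which is the equalizing pattern a minimax criterion would be expected to show.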

The features of LP1 and LP2 can be combined in a single program by incorporating into
LP1 the primal constraint (3.15c) from LP2 in Lagrange form, using a multiplier $\lambda$. The primal
objective then becomes:

    maximize  $y_0 - \lambda \sum_{i=1}^{n} \left( \frac{z_i^+}{\gamma_i^+} + \frac{z_i^-}{\gamma_i^-} \right)$    (3.21)

and the bet price constraints in the dual program become:

    $-\frac{\lambda}{\gamma_i^+} \leq \sum_{j=1}^{m} a_{ij} w_j \leq \frac{\lambda}{\gamma_i^-}$, $i = 1, \ldots, n$.    (3.22)

Here the primal program describes the situation in which, from every bet, the bookie is taking a
"cut" which is proportional to $\lambda$, and also inversely proportional to the corresponding weight
($\gamma_i^+$ or $\gamma_i^-$). That is, the bookie takes relatively larger cuts from those bets for which his bet
prices have low confidence factors. In the dual program $\lambda$ represents an overall factor by which
all of the bet price constraints have been relaxed. The behavior of the optimal solution can be
investigated as a function of $\lambda$ by parametric programming. For any given value of $\lambda$, the sign
of the optimal objective value plays the same role as in LP1 in determining whether a probability
distribution consistent with the (relaxed) constraints has been found, or whether a "sure-win"
bet (taking into account the bookie's cut) has been found. The minimum value of $\lambda$ for
which a probability distribution exists (i.e., the smallest overall cut for which no "sure-win" bet
exists), is $\lambda = v^*$, the optimal objective value that would be obtained in LP2 using the same
weights and bet prices. For any $\lambda > v^*$, the optimal solution will be affected by both $\beta$ and $\gamma$,
and the reconciled bet prices thus determined will generally differ at least slightly from the original
bet prices, even if the original assessment was coherent. In this case the elements of $\beta$
are analogous to the parameters of a prior distribution on the simplex in a Bayesian model, and
the elements of $\gamma$ are analogous to parameters of a likelihood function. Of course, the linear
program should not be applied as if it arose from a true Bayesian model--that is, subjectively
assessing $\beta$ and $\gamma$ once-and-for-all, and accepting the resulting solution. Instead, it appears
suitable for use in an interactive process in which the bookie could explore his "admissible frontier"
of coherent alternatives, adjusting the parameters until satisfied with the solution. The
parameterization of the linear programs described here is simple enough for illustrative purposes
but also appears flexible enough for practical application. (Many other parameterizations
are possible, of course.) Although $\beta$ and $\gamma$ do not correspond exactly to parameters of prior
distributions or likelihood functions, they are nonetheless readily interpretable in terms of their
effects in steering the optimal solution toward a specified "prior" distribution and/or yielding a
reconciliation in which the original bet prices with the highest "confidence" or precision are
revised the least.


4.  Lower  and  upper  probabilities  for  the  unfair  bookie 

As noted earlier, in situations involving many events and subtle interdependencies, it may
be difficult if not impossible for the bookie to keep in mind all the constraints of coherence
while attempting to articulate a set of bet prices which he judges "fair." In such cases he must
either derive his bet prices from a previously assessed probability distribution on the sample
space, or else obtain the help of an external agent to determine whether his initial subjective
bet prices are coherent and to explore nearby coherent alternatives. Therefore, it may be ambiguous
to define a person's subjective probabilities for a set of events as his introspectively-obtained
coherent bet prices without also specifying by what means coherence is to be verified
and incoherence reconciled, if necessary. The acknowledgement that an incoherent initial
assessment is possible not only implies the need for procedures for identifying and reconciling
incoherence, but also casts some doubt on the validity of initial assessments which are
coherent, since they may be coherent only fortuitously. This suggests that the elicitation procedure
should be extended in order to obtain additional information which could be used to
revise or adjust the initial assessment regardless of whether it is incoherent--in particular, information
concerning the relative precision or confidence attached to each of the original bet
prices. Enforcement of the coherence constraints would then provide a basis for jointly
improving the precision of the separate bet prices. This could provide an important practical
tool for improving probability assessments for certain "target" events, by enabling available
subjective information concerning other, related events to be brought to bear in a systematic
way. The Bayesian approach to this problem is to introduce a hierarchy of probabilities--i.e.,
probability distributions on probabilities. By restricting the posterior joint distribution of the
true values to the set of coherent possibilities, an improvement in precision is manifested in the
fact that the variances of the posterior marginal distributions of the separate probabilities will
generally be less than the error variances of the initial assessment, even if a flat prior is
assumed. However, the parameters or hyperparameters whose values must be elicited to
describe these distributions may be difficult to interpret subjectively. In the simplified least-squares
reconciliation method of Lindley, Tversky, and Brown (1979), only two numbers need
to be elicited for each event, namely the initial assessment of the probability and the variance
of its error with respect to the true value. An example is presented in which the variances were
elicited by asking the subject to state a range of plausible values for each probability, after
which "the quoted ranges were interpreted as multiples of standard deviations..." This section
will discuss a conceptually and operationally simpler method for eliciting and utilizing information
concerning the precision of subjective probabilities, in which the bookie is asked to specify
his uncertainty about one event conditional on another in terms of two numbers which are
interpreted as his buying and selling prices for unit bets, when the two prices are not required
to be equal. This is the notion of "lower and upper probabilities," which was axiomatized by
Koopman (1940) and given a betting interpretation by Smith (1961). (A controversial statistical
model was also presented by Dempster (1968).) Lower and upper probabilities have not
been highly popular in practice, even among Bayesians (see, e.g., the discussions to Smith
(1961) and Dempster (1968)), partly because they are not as easily manipulated as the parameters
of hierarchical models by conventional analytical techniques. It will be seen, however,
that they provide a basis for a natural generalization of the coherence theorem of Section 2, and
are readily incorporated into linear programming models for improving precision and reconciling
incoherence.

For the same $n$ pairs of events and same sample space considered throughout this paper,
let the bookie announce his buying price, $p_i^-$, and his selling price, $p_i^+$, for a unit bet on $E_i$
conditional on $F_i$, for every $i$. The bettor then places his bets by choosing a non-negative $2n$-vector
$(z^+, z^-)$, where $z_i^+$ is the number of unit bets on $E_i$ given $F_i$ which he wishes to buy (at
price $p_i^+$), and $z_i^-$ is the number he wishes to sell (at price $p_i^-$). That is, the bettor must buy at
the bookie's selling price, and vice versa. The net gain to the bookie for the $i$th event pair will
then be equal to $((p_i^+ - E_i) z_i^+ - (p_i^- - E_i) z_i^-) F_i$, and his total net gain under the $j$th outcome in
the sample space will be

    $r_j(z^+, z^-; p^+, p^-) = \sum_{i=1}^{n} \left( (p_i^+ - E_{ij}) z_i^+ - (p_i^- - E_{ij}) z_i^- \right) F_{ij}$.    (4.1)

The bettor is free to both buy and sell bets on the same event, although it will not be profitable
for him to do so if $p_i^+ > p_i^-$. Let the payoff vector, $r(z^+, z^-; p^+, p^-)$, be defined as the $m$-vector
whose $j$th element is $r_j(z^+, z^-; p^+, p^-)$. Let the bet prices be defined, as before, to be [strictly]
coherent if-and-only-if there does not exist a combination of bets for which the payoff vector is
[semi-]negative. Then the following generalized version of Theorem 2 is obtained:
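
The payoff expression (4.1) translates directly into code. The sketch below evaluates the bookie's net gain under each outcome for given purchases and sales. The single-event example, the prices 0.3 and 0.4, and the function name are assumptions made for illustration.

```python
def bookie_gain(z_plus, z_minus, p_plus, p_minus, E, F):
    """Bookie's total net gain r_j under each outcome j, following (4.1)."""
    m = len(E[0])
    return [
        sum(((pp - Ei[j]) * zp - (pm - Ei[j]) * zm) * Fi[j]
            for zp, zm, pp, pm, Ei, Fi in zip(z_plus, z_minus, p_plus, p_minus, E, F))
        for j in range(m)
    ]


# Illustrative single event on two outcomes: the bettor buys one unit bet at the
# bookie's selling price 0.4 and sells one at the bookie's buying price 0.3.
E, F = [[1, 0]], [[1, 1]]
gain = bookie_gain([1.0], [1.0], [0.4], [0.3], E, F)
print(gain)
```

Buying and selling the same bet leaves the bookie with the spread, about 0.1 here, under either outcome, which is why such a round trip cannot be profitable for the bettor when $p_i^+ > p_i^-$.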

THEOREM 2': The buying/selling bet prices $(p^-, p^+)$ are [strictly] coherent if-and-only-if
there exists a [positive] probability distribution $w$ on $\Theta$, and a corresponding probability
measure $P_w$ on all subsets of $\Theta$, such that for every $i$, either $p_i^- \leq P_w(E_i \mid F_i) \leq p_i^+$, or
else $P_w(F_i) = 0$.

Proof: Note that the payoff vector is given by

    $r(z^+, z^-; p^+, p^-) = [A^+ \mid -A^-](z^+, z^-) = A^+ z^+ - A^- z^-$    (4.2)

where $A^+$ and $A^-$ are the matrices whose $(j,i)$th elements are $a_{ij}^+ = (p_i^+ - E_{ij})F_{ij}$ and
$a_{ij}^- = (p_i^- - E_{ij})F_{ij}$, respectively. By applying Theorem 1 to the matrix $[A^+ \mid -A^-]$ it follows
that either there exists a bet vector for which the corresponding payoff vector is [semi-]negative,
or else there exists a semi-positive [positive] vector $w$ satisfying $w'[A^+ \mid -A^-] \geq 0$.
Expanding this vector inequality into $n$ pairs of scalar inequalities, and defining the probability
measure as in the proof of Theorem 2, completes the proof.

A similar result is proved, somewhat less transparently, by Smith (1961), in terms of odds for
bets on one event 'against' another. Based on this theorem, it can be shown that coherent buying
and selling bet prices obey the laws of lower and upper probabilities given as axioms by
Koopman (1940), in essentially the same way that coherent fair bet prices were shown to obey
the additive and multiplicative laws in Section 2. The bookie's buying and selling prices may be
considered to provide partial information, in the form of lower and upper bounds, on his fair
bet prices. Having stated a willingness to buy at the price $p_i^-$, he would presumably also buy at
a lower price (if possible), and he might even buy at a higher price (if necessary), but he could
not simultaneously sell at any lower price than $p_i^-$ without inviting certain loss. Similarly, his
initially stated selling price, $p_i^+$, represents an upper bound on his maximum buying price.
Thus, his fair bet prices presumably satisfy $p_i^- \le p_i \le p_i^+$ for all $i$. Theorem 2' states that his
buying and selling prices are coherent if-and-only-if there exists such a set of coherent fair bet
prices. The latter quantities are called 'medial odds' by Smith (1961).
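The condition in Theorem 2' is easy to verify for a given candidate distribution. The sketch below only checks whether a supplied $w$ is a witness of coherence; finding such a $w$ in general requires the linear programs discussed in the text. The function names and the three-outcome example are hypothetical.

```python
def conditional_probability(w, E_i, F_i):
    """Return P_w(E_i | F_i) for 0/1 indicator lists, or None if P_w(F_i) = 0."""
    p_f = sum(wj * f for wj, f in zip(w, F_i))
    if p_f == 0:
        return None
    p_ef = sum(wj * e * f for wj, e, f in zip(w, E_i, F_i))
    return p_ef / p_f


def witnesses_coherence(w, p_buy, p_sell, E, F, tol=1e-12):
    """True if, for every i, p_buy[i] <= P_w(E_i|F_i) <= p_sell[i] or P_w(F_i) = 0."""
    for i in range(len(E)):
        q = conditional_probability(w, E[i], F[i])
        if q is None:          # conditioning event has probability zero
            continue
        if q < p_buy[i] - tol or q > p_sell[i] + tol:
            return False
    return True


# Hypothetical example with three outcomes:
# pair 1 is E_1 = {outcome 0}, unconditional; pair 2 is E_2 = {outcome 1} given {1, 2}.
E = [[1, 0, 0], [0, 1, 0]]
F = [[1, 1, 1], [0, 1, 1]]
w = [0.5, 0.25, 0.25]
ok = witnesses_coherence(w, [0.4, 0.3], [0.6, 0.6], E, F)
not_ok = witnesses_coherence(w, [0.7, 0.3], [0.8, 0.6], E, F)
print(ok, not_ok)
```

A `False` result for one particular $w$ does not by itself establish incoherence, since some other distribution might satisfy all the bounds.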

The generality of different buying and selling prices can be incorporated into the linear
programming models of the last section by a trivial modification in which each parameter $a_{ij}$ in
the constraints of the original primal problem is replaced by the pair of parameters $a_{ij}^+$ and $a_{ij}^-$
defined in the proof above, which are associated with the positive and negative parts of $z_i$,
respectively. For example, in LP1, the primal constraint (3.1b) would be replaced by:


Pe  +  yj  +  £(“>>*<*-“ 'J2~)  *  •••»«.  (4-3a) 

<-i 

$$z_i^+ \ge 0, \quad z_i^- \ge 0, \quad i = 1,\ldots,n. \tag{4.3b}$$

The  dual  constraint  (3.2b)  would  correspondingly  be  replaced  by: 

$$\sum_{j=1}^{m} a_{ij}^+ w_j \ge 0, \quad i = 1,\ldots,n, \tag{4.4a}$$

$$\sum_{j=1}^{m} a_{ij}^- w_j \le 0, \quad i = 1,\ldots,n. \tag{4.4b}$$


In the geometric interpretation of the dual program, for each $i$, $p_i^-$ and $p_i^+$ determine the
orientations of a pair of hyperplanes in $m$-space which pass through the origin and are normal to the
vectors $a_i^-$ and $a_i^+$, which are the $i$th column vectors of the matrices $A^-$ and $A^+$, respectively.
The set of points in the simplex lying on or 'below' the first hyperplane (i.e., satisfying
$w'a_i^- \le 0$) and on or 'above' the second (i.e., satisfying $w'a_i^+ \ge 0$) is the set of distributions
$w$ for which $p_i^- \le P_w(E_i|F_i) \le p_i^+$, or else $P_w(F_i) = 0$. The set of buying and selling prices is
[strictly] coherent if-and-only-if the intersection of all $n$ such sets, denoted $W(p^-,p^+)$, is
non-empty [contains an interior point of the simplex]. If, for some $i$, $F_i$ is not the certain event,
then the intersection of the two hyperplanes determined by $p_i^-$ and $p_i^+$ contains all those points
on the boundary of the simplex for which $P_w(F_i) = 0$. If coherence or strict coherence is
established by solving this linear program, then the optimal dual solution yields a distribution $w^*$
which determines a set of coherent fair bet prices, namely $\pi_i = P_{w^*}(E_i|F_i)$, $i = 1,\ldots,n$, lying
between (or equalling one of) the respective buying and selling prices. In particular, $w^*$ has the
property that it is the closest such distribution to the 'bettor's distribution,' in the sense
discussed earlier.

A similar modification of LP2 can be used to reconcile as well as identify incoherence.
The primal constraint (3.15b) is modified as in (4.3a), and the dual constraint (3.16b) is
replaced by:

I afa  >  “  (4.5a) 

y-l  y i 

Z*0wJ  <  (4.5b) 

y-l  Yi 

where $\gamma_i^-$ is now interpreted as a precision factor for $p_i^-$ and $\gamma_i^+$ is the corresponding precision
factor for $p_i^+$. Note that if $(p^-,p^+)$ is coherent and if $p_i^- < p_i^+$ for all $i$, the optimal objective
value may be negative. In the primal problem the interpretation is that the constraint (3.15c)
may force the bettor to make a combination of bets which will lose money for him under some
outcomes, since he no longer has the option of not betting (i.e., he can no longer buy and sell
at the same price). In the dual problem, the interpretation is that a distribution may exist
which satisfies all the original bet price constraints (4.4a,b) with strict inequality, i.e., the
set $W(p^-,p^+)$ has an interior point.

If $(p^-,p^+)$ is coherent, then a joint improvement in the precision (in the sense of a
narrowing of the intervals $[p_i^-,p_i^+]$, $i = 1,\ldots,n$) may be obtained for the same reason that a
coherent assignment of fair bet prices may place non-trivial upper and lower bounds on the
possible coherent values for a fair bet price on some further, related event. That is, the set
$W(p^-,p^+)$ determines upper and lower bounds on fair bet prices for all event pairs which are
subsets of the same sample space, which, in the case of the event pairs originally considered,
may be tighter bounds than the stated buying and selling prices. (Recall that the stated buying
price is interpretable as a lower bound on the bookie's minimum selling price; it may not be the
greatest lower bound implied by his overall assessment.) The improved lower and upper
bounds, denoted $[\hat p_i^-,\hat p_i^+]$, $i = 1,\ldots,n$, can accordingly be defined as:

$$\hat p_i^- = \min_{w} P_w(E_i|F_i) \tag{4.6a}$$

$$\hat p_i^+ = \max_{w} P_w(E_i|F_i) \tag{4.6b}$$

where the minimization and maximization are with respect to all $w$ in $W(p^-,p^+)$. Necessarily,
$p_i^- \le \hat p_i^- \le \hat p_i^+ \le p_i^+$ for all $i$. The simplest example of this is the two-fold partition, for
which $\hat p^+(E) = \min\{p^+(E),\, 1-p^-(\bar E)\}$, $\hat p^-(E) = \max\{p^-(E),\, 1-p^+(\bar E)\}$, etc. In general,
finding the improved assessment can be approached as a problem in parametric programming
on the columns of the constraint matrix of the modified forms of LP1 or LP2 described above.
For example, to determine $\hat p_i^+$, let the term $a_{ij}^+ = (p_i^+ - E_{ij})F_{ij}$ be replaced by $(p_i^+ - \lambda - E_{ij})F_{ij}$ for
$j = 1,\ldots,m$, in the column of the constraint matrix corresponding to the variable $z_i^+$. The
resulting linear program can be studied parametrically as a function of $\lambda$, and the improved
value for $p_i^+$ is obtained as $\hat p_i^+ = p_i^+ - \lambda^*$, where $\lambda^*$ is the largest value of $\lambda$ for which the
optimal objective value is not greater than zero.
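For the two-fold partition the improved bounds have a closed form, so no linear program is needed. The sketch below implements that special case only; the function name `tighten_two_fold`, the argument layout, and the example intervals are assumptions made for illustration.

```python
def tighten_two_fold(lo_E, hi_E, lo_not_E, hi_not_E):
    """Improved lower/upper bounds for an event E and its complement.

    Inputs are the stated buying (lo) and selling (hi) prices for E and for not-E.
    Returns (coherent, (new_lo_E, new_hi_E), (new_lo_not_E, new_hi_not_E)).
    """
    new_lo_E = max(lo_E, 1.0 - hi_not_E)
    new_hi_E = min(hi_E, 1.0 - lo_not_E)
    new_lo_not_E = max(lo_not_E, 1.0 - hi_E)
    new_hi_not_E = min(hi_not_E, 1.0 - lo_E)
    # The stated prices are coherent exactly when the tightened interval is non-empty.
    coherent = new_lo_E <= new_hi_E
    return coherent, (new_lo_E, new_hi_E), (new_lo_not_E, new_hi_not_E)


# Hypothetical example: E priced in [0.3, 0.6], its complement in [0.5, 0.8].
coherent, bounds_E, bounds_not_E = tighten_two_fold(0.3, 0.6, 0.5, 0.8)
print(coherent, bounds_E, bounds_not_E)
```

Here the selling price for $E$ is tightened because the buying price for the complement already rules out any fair price for $E$ above one minus that buying price.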


5.  General scoring rules and their probability transforms

As an alternative to betting systems, subjective probability can be defined and measured
in terms of marginal rates of substitution, through the use of scoring rules. A scoring rule can
be represented as a loss function of two arguments, $f(x,E)$, where $E$ may be either 0 or 1, and
$x$ is a real number whose domain is usually taken to be the unit interval, with $f(x,0)$ being
strictly increasing and $f(x,1)$ strictly decreasing in $x$. (That is, $x_1 < x_2$ implies
$f(x_1,0) < f(x_2,0)$ and $f(x_1,1) > f(x_2,1)$.) A person's subjective probability for $E$ can be
defined in terms of the value for $x$ which he would choose under the condition of receiving a
loss of $f(x,E)$. He will presumably adjust his choice for $x$ until he finds a point at which the
value for him of the marginal decrease in his loss (score) under one outcome due to further
changes in $x$ is exactly balanced by the value of the marginal increase under the other outcome.
This approach can be generalized for eliciting conditional probabilities in a manner analogous to
"called off bets," by letting the loss for $E$ conditional on $F$ be given by $f(x,E)F$, so that the
loss is zero if $F = 0$ obtains, regardless of the value of $E$.

A scoring rule is called proper if its $x$-domain is the unit interval and it has the property
that a person minimizes his expected loss by choosing $x = p$ when his "true" subjective probability
for $E$ is $p$. The prototype proper scoring rule is the quadratic rule, $f(x,E) = k(E-x)^2$, for
some positive constant $k$. This scoring rule may be considered simply as squared-error loss for
choosing $x$ to "predict" the value of $E$. The quadratic rule has been used by de Finetti as the basis
for much of his theory of subjective probability, and also (in more general forms) has a long
history of practical application in meteorology as a method of evaluating forecasts (e.g., Brier
(1950), Stael von Holstein and Murphy (1978)). Another well-known proper scoring rule is
the symmetric logarithmic rule, $f(x,1) = f(1-x,0) = -k\log(x)$, whose use was recommended
by Good (1952) for the reason that, with the inclusion of an appropriate additive constant,
the expected reward (negative loss) to the probability assessor is proportional to the
amount of information (according to Shannon's negative-entropy definition) contained in his
assessment. The same effect can be obtained with respect to an assignment of probabilities to a
general partition by using the asymmetric logarithmic rule: $f(x,0) = kx$, $f(x,1) = k(x-\log(x))$,
which is also a proper scoring rule. A detailed discussion of the properties and uses of proper
scoring rules has been given by Savage (1971).
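Propriety of the three rules named above can be checked numerically by minimizing the expected loss over a grid. This is only a sanity check under assumptions made for the sketch: $k = 1$, a uniform interior grid on the unit interval, and hypothetical function names.

```python
import math


def quadratic(x, e):
    return (e - x) ** 2


def symmetric_log(x, e):
    return -math.log(x) if e == 1 else -math.log(1.0 - x)


def asymmetric_log(x, e):
    return x - math.log(x) if e == 1 else x


def expected_loss(f, x, p):
    """Expected loss p*f(x,1) + (1-p)*f(x,0) when the true probability is p."""
    return p * f(x, 1) + (1.0 - p) * f(x, 0)


def argmin_on_grid(f, p, steps=10000):
    """Grid point in the open unit interval with the smallest expected loss."""
    best_x, best_val = None, float("inf")
    for i in range(1, steps):
        x = i / steps
        val = expected_loss(f, x, p)
        if val < best_val:
            best_x, best_val = x, val
    return best_x


p = 0.3
minimizers = {f.__name__: argmin_on_grid(f, p)
              for f in (quadratic, symmetric_log, asymmetric_log)}
print(minimizers)  # each minimizer should lie within one grid step of p
```

A grid search of this kind can only confirm propriety up to the grid resolution; the analytic argument is that the derivative of the expected loss vanishes at $x = p$ for each rule.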

As a basis for defining and measuring subjective probability, scoring rules have an
advantage over betting systems in that no intelligent antagonist is involved: a person's net loss is
determined only by the value he chooses for his probability and by the state of nature which
obtains. When probabilities are elicited simultaneously for a finite number of different events
on the condition that the total score will be the sum of the separate scores, the requirement of
admissibility (that unnecessary certain loss must be avoided) can be used to establish the same
probability laws (derived from the existence of an underlying probability measure) that were
established for bet prices based on the coherence requirement. De Finetti (1972, 1974) proves
this result for the quadratic scoring rule by a series of geometric arguments in which the score
plays the role of squared Euclidean distance. Recently Lindley (1980) has explored the
properties of scales of 'subjective conditional uncertainty,' operationally defined in terms of
generalized scoring rules satisfying only certain modest regularity requirements and whose $x$-domains
are allowed to be arbitrary intervals of the real line. In a series of arguments based on
determinants of the matrix of scoring function derivatives, somewhat parallel to de Finetti's proof of
the coherence theorem, Lindley shows that admissible sets of uncertainty values elicited under
a generalized scoring rule can be transformed into numbers in the unit interval which must
obey the laws of probability for their respective events. In this section and the next, using
appropriate definitions of admissibility and strict admissibility, a stronger version of Lindley's
results will be proved by exploiting the equivalence of bet prices and choices under scoring
rules, then invoking the results of the previous sections.

Let a scale of subjective conditional uncertainty be operationally defined on an interval
$[x^F,x^T]$, which might be finite, semi-infinite, or infinite in extent, using a generalized scoring
rule $f(x,E)$ satisfying the following regularity assumptions:

(A1) $f(\cdot,0)$ and $f(\cdot,1)$ are continuous and bounded below, and are strictly increasing and
decreasing, respectively, on the interval $[x^F,x^T]$;

(A2) $f(\cdot,0)$ and $f(\cdot,1)$ have continuous first derivatives on $[x^F,x^T]$, denoted $f'(\cdot,0)$ and
$f'(\cdot,1)$, respectively;

(A3) $f'(\cdot,0)$ and $f'(\cdot,1)$ are not both zero or both infinite in magnitude at any point in
the open interval $(x^F,x^T)$;

(A4) $\displaystyle\lim_{x \to x^F} \frac{f'(x,0)}{f'(x,1)} = 0$ and $\displaystyle\lim_{x \to x^T} \frac{f'(x,1)}{f'(x,0)} = 0$.

These assumptions are similar, but not quite identical, to those given by Lindley (1980).
Lindley's motivation for considering this generalization of the scoring-rule concept was to
determine whether any method for describing uncertainty about an event or hypothesis which
did not implicitly obey the laws of probability (e.g., confidence levels, fuzzy logic) could be
given a subjectivistic basis in terms of a pair of loss functions. (The subsequent admissibility
analysis shows this to be impossible, i.e., probability is the only sensible description of
subjective uncertainty. In retrospect, this is not surprising, since choosing uncertainty values under a
scoring rule is a special kind of "S-game against nature," for which there is a well-known
relation between admissible strategies and Bayes strategies (Blackwell and Girshick (1954)).) One
desirable property for a general uncertainty scale is monotonicity, that is, the right and left
endpoints of the scale should correspond to logical (i.e., certain) truth and falsehood, respectively
(hence the superscripts "T" and "F"), and for intermediate values the indicated degree of
certainty (as to the event being true) should increase monotonically from left to right. To
implement this notion, it appears reasonable to assume that the assigned loss should be an increasing
function of $x$ if the event turns out to be false, and a decreasing function of $x$ if it turns out to
be true. A1 is a formalization of this assumption. Another desirable property is smoothness,
which is provided by the differentiability assumption, A2. A3 guarantees regularity in the sense
that the losses associated with neighboring points must be distinct by a first-order amount under
at least one outcome, ruling out certain kinds of degeneracy. A4 ensures that the scoring rule
is well-behaved, in a sense which will be made clear below, at the endpoints of $[x^F,x^T]$. (A
relaxation of A2 and A4 will be mentioned at the end of the next section.)

After a person reveals his subjective conditional uncertainty about $E$ given $F$ by the
number $x$ he chooses subject to a loss of $f(x,E)F$, a unique number in the unit interval,
suggestive of a conditional probability, can immediately be associated with $x$ by the
marginal-rate-of-substitution argument sketched above. (It will be shown later that the numbers so
determined must indeed obey the laws of probability if unnecessary certain loss is to be avoided.) In
the vicinity of $x$, a change of $\Delta x$ leads to a gain of approximately $-f'(x,1)\Delta x$ if $EF = 1$ and a
loss of approximately $f'(x,0)\Delta x$ if $(1-E)F = 1$. (Note that, by A1, $f'(x,1) \le 0$ and
$f'(x,0) \ge 0$ for all $x$ in $[x^F,x^T]$.) If it is assumed that the person is indifferent between $x$ and
$x+\Delta x$, regardless of whether $\Delta x$ is positive or negative (provided it is sufficiently small), then
his conditional subjective probability for $E$ given $F$, denoted $p$, evidently satisfies
$p(-f'(x,1)) = (1-p)f'(x,0)$, leading to the equation $p = P(x)$, where $P(x)$ is the "probability
transform of $x$" as defined by Lindley (1980):

$$P(x) = \frac{f'(x,0)}{f'(x,0) - f'(x,1)}. \tag{5.1}$$

The person who chooses $x$ to denote his uncertainty about $E$ given $F$, under the scoring rule
$f$, is therefore considered to be like the bookie who will accept an arbitrary small bet, $z$ (either
positive or negative, i.e., buying or selling), at price $p$, where $p = P(x)$ and
$z = (f'(x,0) - f'(x,1))\Delta x$.
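The probability transform (5.1) can be evaluated numerically for any smooth scoring rule. The sketch below uses central differences at interior points; the cubic rule $f(x,E) = |E-x|^3$ is a hypothetical example (not taken from the report) of a rule on the unit interval that is not proper, with transform $x^2/(x^2+(1-x)^2)$.

```python
def probability_transform(f, x, h=1e-6):
    """Approximate P(x) = f'(x,0) / (f'(x,0) - f'(x,1)) by central differences.

    Intended for interior points only, so that x - h and x + h stay in the domain.
    """
    d0 = (f(x + h, 0) - f(x - h, 0)) / (2.0 * h)
    d1 = (f(x + h, 1) - f(x - h, 1)) / (2.0 * h)
    return d0 / (d0 - d1)


def quadratic(x, e):
    return (e - x) ** 2


def cubic_rule(x, e):
    # Hypothetical improper rule, chosen only to illustrate a nontrivial transform.
    return abs(e - x) ** 3


p_quadratic = probability_transform(quadratic, 0.25)
p_cubic = probability_transform(cubic_rule, 0.25)
print(p_quadratic, p_cubic)
```

For the quadratic rule the transform is the identity, as expected of a proper rule, while under the cubic rule a stated value of 0.25 corresponds to a betting price of about 0.1.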

Another way to interpret the probability transform is to note that if the person's "true"
subjective probability for $E$ given $F$ is $p$, then his conditional expected score due to the choice
$x$ is given by

$$r(x,p) = p f(x,1) + (1-p) f(x,0). \tag{5.2}$$

A Bayesian would wish this quantity to be minimal, and a necessary condition for $x$ to
minimize $r(\cdot,p)$ is that $r'(x,p) = 0$, which leads to $p = P(x)$ as defined above. (Note: "prime" will
consistently denote differentiation with respect to the first argument.) Thus, $x$ is a stationary
point (either a local minimum, local maximum, or inflection point) of $r(\cdot,p)$, when $p = P(x)$,
by the definition (5.1). The regularity conditions for $f$ can be interpreted as ensuring that $P$ is
a continuous function which satisfies $0 \le P(x) \le 1$ on $[x^F,x^T]$, and furthermore, in view of
(A4), $P(x^F) = 0$ and $P(x^T) = 1$. In fact, (A4) guarantees that $r(\cdot,p)$ is minimized at $x^F$ only for
$p = 0$, and similarly at $x^T$ for $p = 1$. For any $p > 0$, $r(\cdot,p)$ is minimized at some $x > x^F$, although
for $p$ sufficiently small this point can be made arbitrarily close to $x^F$, and similarly for $p < 1$ in
the vicinity of $x^T$.

If the probability transform is a strictly increasing function of $x$, then every $x$ in $[x^F,x^T]$
is the unique point which minimizes $r(\cdot,p)$, where $p = P(x)$. A sufficient but not necessary
condition for $P(x)$ to be strictly increasing is for $f(\cdot,E)$ to be strictly convex for both values of
$E$. If $f(\cdot,0)$ and $f(\cdot,1)$ have continuous second derivatives on $[x^F,x^T]$, then $P$ has a
continuous first derivative, given by

$$P'(x) = \frac{f'(x,0)f''(x,1) - f'(x,1)f''(x,0)}{(f'(x,0) - f'(x,1))^2}. \tag{5.3}$$

In this case a necessary and sufficient condition for $P$ to be strictly increasing is $P'(x) > 0$
almost everywhere on $[x^F,x^T]$. From the above expression this condition is seen to be
equivalent to $f'(x,0)f''(x,1) - f'(x,1)f''(x,0) > 0$ a.e. on $[x^F,x^T]$. (Note that this condition is
weaker than strict convexity, i.e., weaker than requiring $f''(x,0) > 0$ and $f''(x,1) > 0$ a.e. on
$[x^F,x^T]$.) Moreover, the conditional expected score function, $r(x,p)$, has a second derivative
with respect to $x$ in this case, and it is easily shown that $r''(x,p) = P'(x)(f'(x,0) - f'(x,1))$,
where $p = P(x)$. By assumptions A1 and A3, $f'(x,0) - f'(x,1) > 0$ on $(x^F,x^T)$, so that $r''(x,p)$ has the
same sign as $P'(x)$. Hence, $r(\cdot,p)$ is locally convex at $x$ if $P$ is increasing at $x$, which implies
that $x$ is at least a local (if not global) minimum of $r(\cdot,p)$ for $p = P(x)$; and conversely, if $P$ is
decreasing at $x$, then $x$ is a local maximum of $r(\cdot,p)$. This suggests that, if $P$ is not strictly
increasing on $[x^F,x^T]$, it would be sensible, in choosing $x$, to restrict attention to those values
in whose vicinity $P$ is increasing, insofar as a person who wished to minimize his expected
score for any given probability would not do otherwise.
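The two derivative identities used in the preceding paragraph follow from the quotient rule. A short derivation is sketched here; the abbreviations $a$ and $b$ are introduced only for this sketch and are not notation from the report.

```latex
% Abbreviations (assumed for this sketch): a = f'(x,0), b = f'(x,1),
% so that a' = f''(x,0), b' = f''(x,1), and P = a/(a-b).
\begin{align*}
P'(x) &= \frac{a'(a-b) - a(a'-b')}{(a-b)^2}
       = \frac{a\,b' - b\,a'}{(a-b)^2}, \\[4pt]
r'(x,p) &= p\,b + (1-p)\,a, \qquad
r''(x,p) = p\,b' + (1-p)\,a'. \\[4pt]
\intertext{Substituting $p = P(x) = a/(a-b)$ and $1-p = -b/(a-b)$:}
r''(x,p) &= \frac{a\,b' - b\,a'}{a-b} = P'(x)\,(a-b).
\end{align*}
```

Since $a - b > 0$ on the open interval, the sign of $r''$ at a stationary point is the sign of $P'$, which is the fact used to classify local minima and maxima.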

A proper scoring rule is a special case of a scoring rule with a strictly increasing
probability transform, for which $x^F = 0$, $x^T = 1$, and $P(x) = x$ on $[0,1]$. Conversely, every scoring rule
with a strictly increasing probability transform can be converted into a proper scoring rule by an
appropriate transformation of the $x$-axis, in particular, by the one-to-one transformation of
$[x^F,x^T]$ onto $[0,1]$ defined by $P$. That is, if $f$ is a scoring rule whose probability transform, $P$,
is strictly increasing (and hence invertible) on $[x^F,x^T]$, then an associated proper scoring rule,
denoted $f^*$, can be defined according to $f^*(x,E) = f(P^{-1}(x),E)$. Thus scoring rules with
strictly increasing probability transforms appear to be a natural generalization of proper scoring
rules, a notion which will be made more concrete in the next section.
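The conversion $f^*(x,E) = f(P^{-1}(x),E)$ can be illustrated numerically. The sketch below assumes a hypothetical cubic rule $f(x,E) = |E-x|^3$ on the unit interval, whose transform $x^2/(x^2+(1-x)^2)$ is strictly increasing, and inverts the transform by bisection; none of these names come from the report.

```python
def cubic_rule(x, e):
    # Hypothetical improper rule with a strictly increasing probability transform.
    return abs(e - x) ** 3


def cubic_transform(x):
    # Closed form of P(x) for the cubic rule above.
    return x * x / (x * x + (1.0 - x) ** 2)


def inverse_transform(P, q, lo=0.0, hi=1.0, iters=60):
    """Invert a strictly increasing transform P on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if P(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def proper_version(f, P):
    """Return f*(q, e) = f(P^{-1}(q), e)."""
    def f_star(q, e):
        return f(inverse_transform(P, q), e)
    return f_star


f_star = proper_version(cubic_rule, cubic_transform)
p = 0.2
grid = [i / 100 for i in range(101)]
best_q = min(grid, key=lambda q: p * f_star(q, 1) + (1 - p) * f_star(q, 0))
print(best_q)  # the expected loss under f* should be smallest at the grid point nearest p
```

Under the original cubic rule the same person would state $x = 1/3$, since that is the point whose transform equals 0.2; the reparametrized rule makes the stated number coincide with the probability itself.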

An interesting, if somewhat pathological, example of a scoring rule whose probability
transform need not be strictly increasing is the trigonometric scoring rule with frequency
parameter $k$ (a non-negative integer), given by:

$$f(x,0;k) = f(1-x,1;k) = x - \frac{\sin((2k+1)\pi x)}{(2k+1)\pi} \tag{5.4}$$

with $x^F = 0$ and $x^T = 1$. Note that $f(x,0;k)$ and $f(x,1;k)$ are reflections of each other in the
line $x = \tfrac12$, and also $f(x,0;k) = f(x,1;k) + 2x - 1$, whence $f'(x,0;k) - f'(x,1;k) = 2$, for all $x$.
The corresponding probability transform is thus given by

$$P(x;k) = \tfrac12 f'(x,0;k) = \tfrac12\big(1 - \cos((2k+1)\pi x)\big). \tag{5.5}$$

The trigonometric scoring functions, $f(x,0;k)$ and $f(x,1;k)$, and their probability transform,
$P(x;k)$, are plotted in Figures (5.1a) and (5.1b) for $k = 0$, and in Figures (5.2a) and (5.2b) for
$k = 1$. Note that the graph of the probability transform has the shape of a raised, inverted
cosine wave which executes $k + \tfrac12$ cycles in the unit interval. For $k = 0$ the probability
transform is strictly increasing and "nearly proper," i.e., its graph is close to the line $y = x$. In
fact, the scoring functions closely resemble those of the quadratic rule in this case. For $k = 1$,
however, the probability transform increases monotonically from 0 to 1 on the interval $[0,\tfrac13]$,
then decreases monotonically from 1 back to 0 on $[\tfrac13,\tfrac23]$, and finally increases monotonically
from 0 to 1 again on $[\tfrac23,1]$. Thus, the equation $P(x;1) = p$ has two distinct solutions in $x$ for
$p = 0$ or $p = 1$, and three distinct solutions in $x$ for all intermediate values of $p$. In particular, if
$x$ lies in the interval $[0,\tfrac13]$ and is a solution to $P(x;1) = p$ for some $p$, then $\tfrac23 - x$ and $\tfrac23 + x$
are also solutions. Moreover, if $0 < p < 1$ and $p \ne \tfrac12$, then the solution to $P(x;1) = p$ which lies
either in $(0,\tfrac16)$ or in $(\tfrac56,1)$ is the unique global minimum of the conditional expected score
function; the solution which lies either in $(\tfrac16,\tfrac13)$ or in $(\tfrac23,\tfrac56)$ is a local but not global
minimum; and the solution which lies in $(\tfrac13,\tfrac23)$ is a local maximum. (Note that this is
consistent with the earlier observation that if $P(x) = p$, then $x$ is a local minimum [maximum] of
$r(\cdot,p)$ if $P$ is increasing [decreasing] at $x$.) The equation $P(x;1) = \tfrac12$ has the solutions
$x = \tfrac16$, $x = \tfrac56$, and $x = \tfrac12$, the first two of which are both global minima of $r(\cdot,\tfrac12;1)$ and the last of
which is the global maximum. $r(\cdot,p;1)$ is plotted in Figures (5.3a) and (5.3b), the latter for
$p = \tfrac12$.
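The three stationary points for $k = 1$ can be checked numerically. The sketch below implements (5.4) and (5.5) directly and compares the expected score at the three solutions of $P(x;1) = p$; the function names and the choice $p = 0.25$ are assumptions made for the illustration.

```python
import math


def trig_rule(x, e, k):
    """Trigonometric scoring rule (5.4); f(x,1;k) is f(1-x,0;k)."""
    c = (2 * k + 1) * math.pi
    u = x if e == 0 else 1.0 - x
    return u - math.sin(c * u) / c


def trig_transform(x, k):
    """Probability transform (5.5)."""
    return 0.5 * (1.0 - math.cos((2 * k + 1) * math.pi * x))


def expected_score(x, p, k):
    return p * trig_rule(x, 1, k) + (1.0 - p) * trig_rule(x, 0, k)


def stationary_points_k1(p):
    """The three solutions of P(x;1) = p for 0 < p < 1."""
    x1 = math.acos(1.0 - 2.0 * p) / (3.0 * math.pi)   # lies in [0, 1/3]
    return [x1, 2.0 / 3.0 - x1, 2.0 / 3.0 + x1]


p = 0.25
points = stationary_points_k1(p)
scores = [expected_score(x, p, 1) for x in points]
grid = [i / 900 for i in range(901)]
grid_min = min(grid, key=lambda x: expected_score(x, p, 1))
print(points, scores, grid_min)
```

For this value of $p$ the first solution lies in $(0,\tfrac16)$ and has the lowest expected score, the third lies in $(\tfrac23,\tfrac56)$ and is a local minimum with a higher score, and the middle one is a local maximum, in agreement with the classification given in the text.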


Fig. 5.3a

Fig. 5.3b


6.  Admissibility under generalized scoring rules

For the same $n$ event pairs and same sample space considered earlier, let a person reveal
his subjective conditional uncertainty about $E_i$ given $F_i$ by choosing a number $x_i$ in an interval
$[x_i^F,x_i^T]$ under a generalized scoring rule $f_i$ satisfying the regularity assumptions given in the
last section, for $i = 1,\ldots,n$. (A different scoring rule may be used for each event pair. The
corresponding probability transforms and conditional expected score functions will be denoted
$P_i$, $r_i$, etc.) It is assumed that the person's total score (loss) under the $j$th outcome in the
sample space, denoted $s_j(x)$, is given by the sum of the scores for the separate event pairs, i.e.,

$$s_j(x) = \sum_{i=1}^{n} f_i(x_i,E_{ij})F_{ij}, \tag{6.1}$$

where the vector $x$ is used to represent the set of choices $(x_1,\ldots,x_n)$. The score vector can
now be defined as the $m$-vector $s(x)$ whose $j$th element is $s_j(x)$.

DEFINITION: The vector of choices $x$ is admissible if there does not exist any other
vector $y$ for which $s(y) - s(x) < 0$.

This definition follows de Finetti (1972). Admissible choices under scoring rules are analogous
to coherent bet prices in avoiding unnecessary uniform loss under all outcomes in the sample
space; however, the corresponding notion of strict admissibility is not obtained merely by
substituting '$\le$' for '$<$' in the above definition, for reasons which will become apparent. Instead,
the following is required:

DEFINITION: The vector of choices $x$ is strictly admissible if there exists some $\epsilon > 0$ for
which

$$\max_j\, [s_j(y) - s_j(x)] \ge \epsilon \max_j\, [s_j(x) - s_j(y)]$$

for all other vectors $y$.

In other words, a choice vector is admissible if there is no alternative choice yielding a lower
score under every outcome, and strictly admissible if, for some $\epsilon$, every alternative choice
which lowers the score by, say, $\Delta s$ under one outcome, raises the score by at least $\epsilon\Delta s$ under
some other outcome. Strictly admissible choices are analogous to strictly coherent bet prices in
that, relative to alternative choices, they do not admit a loss under one outcome without a
proportional gain under some other outcome. That is, a person who adheres to strict admissibility
will not accept the chance of a finite [infinite] loss in return for the chance of an infinitesimal
[finite] gain.
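The definition of admissibility can be made concrete with a small dominance check. The sketch below computes the score vector (6.1) for the quadratic rule and tests whether one choice vector has a lower score than another under every outcome; the function names and the numerical example (an event and its complement, with stated values summing to 1.3) are assumptions made for the illustration.

```python
def quadratic(x, e):
    return (e - x) ** 2


def score_vector(x, rules, E, F):
    """Total score s_j(x) under each outcome j, as in (6.1)."""
    n, m = len(E), len(E[0])
    return [sum(rules[i](x[i], E[i][j]) * F[i][j] for i in range(n))
            for j in range(m)]


def dominates(s_y, s_x):
    """True if the alternative has a strictly lower score under every outcome."""
    return all(a < b for a, b in zip(s_y, s_x))


# Hypothetical example: an event and its complement, two outcomes, unconditional.
E = [[1, 0], [0, 1]]
F = [[1, 1], [1, 1]]
rules = [quadratic, quadratic]
x = [0.7, 0.6]      # stated values sum to 1.3, so they are not probabilities
y = [0.55, 0.45]    # same values shifted down to sum to 1
s_x = score_vector(x, rules, E, F)
s_y = score_vector(y, rules, E, F)
print(s_x, s_y, dominates(s_y, s_x))
```

Exhibiting one dominating alternative is enough to show that a choice is inadmissible; establishing admissibility would require ruling out every alternative, which is what the theorems of this section address.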

THEOREM 3: A vector of choices $x = (x_1,\ldots,x_n)$ is [strictly] admissible only if
$p = (p_1,\ldots,p_n)$ is a [strictly] coherent vector of bet prices for the same events, where
$p_i = P_i(x_i)$, $i = 1,\ldots,n$.

Proof: Suppose $p$ is not [strictly] coherent. Then there exists a "sure-win" ["can't-lose"]
bet $z$, i.e., for which $t(z;p) < 0$ [$\le 0$], where $t(z;p)$ is the payoff vector defined in Section
2. This bet vector, together with a small positive constant $\delta$, will be used to define a
vector of small changes, $\Delta x(\delta)$, such that for small enough $\delta$ the existence of the alternative
choice $y = x + \Delta x(\delta)$ will contradict the assumption of [strict] admissibility. Let $\Delta x(\delta)$ be
defined in the following way: if $f_i'(x_i,0)$ and $f_i'(x_i,1)$ are both non-zero and finite in
magnitude, let

$$\Delta x_i(\delta) = \frac{\delta z_i}{f_i'(x_i,0) - f_i'(x_i,1)}. \tag{6.2}$$

Then,

$$f_i(x_i + \Delta x_i(\delta),E) - f_i(x_i,E) = \delta(p_i - E)z_i + o(\delta), \tag{6.3}$$

for both values of $E$, where "little-o" notation is used to denote an arbitrary function
satisfying $\lim_{\delta \to 0} o(\delta)/\delta = 0$.

If at least one of the derivatives is zero or infinite, then either $P_i(x_i) = 0$ or $P_i(x_i) = 1$.
Suppose that $P_i(x_i) = 0$. Then, if $f_i(x_i,1) = \infty$ (which is only possible if $x_i = x_i^F$), let $\Delta x_i(\delta)$
be chosen so that $f_i(x_i + \Delta x_i(\delta),0) - f_i(x_i,0) = \delta^2$. (This is possible, for small enough $\delta$,
since $f_i(x,0)$ is continuous and strictly increasing.) Note that, in this case, for any
$\Delta x_i(\delta) > 0$, $f_i(x_i + \Delta x_i(\delta),1) - f_i(x_i,1) = -\infty$, so that an infinite decrease in the score is
obtained when $E_iF_i = 1$. If, however, $f_i(x_i,1) < \infty$, then let $\Delta x_i(\delta)$ be chosen so that
$f_i(x_i + \Delta x_i(\delta),1) - f_i(x_i,1) = -\delta z_i$. (Here it may be assumed, w.l.o.g., that $z_i \ge 0$, since the
bettor thereby obtains at least a "can't-lose" situation for the $i$th event pair when $p_i = 0$.)
Note that when $E_iF_i = 1$, the decrease in score due to $\Delta x_i(\delta)$ is

$$f_i(x_i,1) - f_i(x_i + \Delta x_i(\delta),1) = \int_{x_i}^{x_i + \Delta x_i(\delta)} (-f_i'(x,1))\,dx = \delta z_i, \tag{6.4}$$

whereas the increase in score when $(1-E_i)F_i = 1$ is

$$f_i(x_i + \Delta x_i(\delta),0) - f_i(x_i,0) = \int_{x_i}^{x_i + \Delta x_i(\delta)} f_i'(x,0)\,dx. \tag{6.5}$$

Now, $P_i(x_i) = 0$ implies that

$$\lim_{\Delta x_i(\delta) \to 0} \frac{f_i'(x_i + \Delta x_i(\delta),0)}{f_i'(x_i + \Delta x_i(\delta),1)} = 0. \tag{6.6}$$

It follows that the first integrand above can be made uniformly larger than the second by
an arbitrarily large multiplicative factor by taking $\Delta x_i(\delta)$ small enough, which in turn can
be accomplished by taking $\delta$ small enough, since $\Delta x_i(\delta) \to 0$ as $\delta \to 0$ by the assumed
continuity and finiteness of $f_i(x,1)$ near $x = x_i$ in this case. Since the first integral is by
definition proportional to $\delta$, the second integral must therefore be proportional to $o(\delta)$.
Let corresponding definitions be made for $\Delta x_i(\delta)$ if $P_i(x_i) = 1$. In this manner, a vector of
changes is obtained for which $\Delta x_i(\delta) > 0$ [$< 0$] if-and-only-if $z_i > 0$ [$< 0$]. Furthermore,
for both values of $E$, either

$$f_i(x_i + \Delta x_i(\delta),E) - f_i(x_i,E) = -\infty, \tag{6.7}$$

or

$$f_i(x_i + \Delta x_i(\delta),E) - f_i(x_i,E) = \delta(p_i - E)z_i + o(\delta). \tag{6.8}$$

Therefore, for every $j$, either

$$s_j(x + \Delta x(\delta)) - s_j(x) = -\infty, \tag{6.9}$$

or

$$s_j(x + \Delta x(\delta)) - s_j(x) = \delta t_j(z;p) + o(\delta). \tag{6.10}$$

Now, by the assumption that $z$ is a "sure-win" ["can't-lose"] bet, $t(z;p) < 0$ [$\le 0$]; so that
by taking $\delta$ small enough the score change can be made negative under every outcome
[negative and proportional to $\delta$ under at least one outcome, and proportional to $o(\delta)$
under the remaining outcomes] which proves that $x$ is not [strictly] admissible.
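The construction in the proof can be traced numerically in the regular case (6.2). The sketch below assumes the quadratic rule for both events, for which $P(x) = x$ and $f'(x,0) - f'(x,1) = 2$, and a hypothetical sure-win bet against prices that sum to 1.3; the helper names are illustrative only.

```python
def quadratic(x, e):
    return (e - x) ** 2


def d_quadratic(x, e):
    """Derivative of the quadratic rule with respect to x."""
    return -2.0 * (e - x)


def bet_payoff(z, p, E, F):
    """Bookie's payoff t_j(z;p) = sum_i (p_i - E_ij) z_i F_ij under each outcome j."""
    n, m = len(E), len(E[0])
    return [sum((p[i] - E[i][j]) * z[i] * F[i][j] for i in range(n))
            for j in range(m)]


def total_score(x, E, F):
    n, m = len(E), len(E[0])
    return [sum(quadratic(x[i], E[i][j]) * F[i][j] for i in range(n))
            for j in range(m)]


def perturb(x, z, delta):
    """Alternative choice x + dx(delta), with dx_i as in (6.2)."""
    return [xi + delta * zi / (d_quadratic(xi, 0) - d_quadratic(xi, 1))
            for xi, zi in zip(x, z)]


E = [[1, 0], [0, 1]]
F = [[1, 1], [1, 1]]
x = [0.7, 0.6]          # under the quadratic rule these are also the bet prices p
z = [-1.0, -1.0]        # the bettor sells one unit of each bet
delta = 0.1
t = bet_payoff(z, x, E, F)
y = perturb(x, z, delta)
change = [sy - sx for sy, sx in zip(total_score(y, E, F), total_score(x, E, F))]
print(t, y, change)
```

In this example the score change under each outcome equals $\delta t_j$ plus a second-order remainder, and it is negative under both outcomes, so the perturbed choice dominates the original one as the proof requires.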

Thus it is seen that a necessary condition for a set of choices to be [strictly] admissible (in fact,
locally so) is for the probability transforms to be [strictly] coherent bet prices, which in turn
implies the existence of a probability distribution, $w$, and a measure based on it, $P_w$, for which
these are conditional probabilities. Considering the score due to the choice $x$ as a random
variable, denoted $S(x)$, the expected value of $S(x)$ under the probability measure $P_w$ is given by

$$E_w(S(x)) = w's(x) = \sum_{i=1}^{n} P_w(F_i)\, r_i(x_i,P_w(E_i|F_i)) \tag{6.11}$$

for all $x$, where $r_i(x,p)$ is the expected partial score function defined in the last section. If $x$ is
a vector whose probability transform is consistent with the distribution $w$, i.e., for which
$P_i(x_i) = P_w(E_i|F_i)$ for every $i$, then $r_i'(x_i,P_w(E_i|F_i)) = 0$, so that all the derivatives of the
expected score function, evaluated at $x$, are zero. Theorem 3 can therefore be paraphrased as
follows: $x$ is [strictly] admissible only if there exists a [positive] probability distribution on the
sample space for which the gradient of the expected score, evaluated at $x$, is the zero vector.
The gradient being the zero vector is, of course, a first-order necessary condition for an
unconstrained minimum of a smooth function. Thus, a necessary condition for [strict] admissibility
is that $x$ must satisfy a first-order condition for a minimum of the expected score, under a
[positive] probability distribution on the sample space. On the other hand, consideration of the
properties of a minimum of the expected score leads to sufficient conditions for admissibility
or strict admissibility. A choice $x$ which minimizes the expected score under the distribution $w$
is said to be "Bayes against $w$." $x$ will simply be described as Bayes if it is Bayes against some $w$,
and strictly Bayes if it is Bayes against some $w > 0$. In these terms, we have:

THEOREM  4:  x  is  [strictly]  admissible  if  it  is  [strictly]  Bayes. 

Proof: For the non-strict case this result is obvious: an alternative choice yielding a
lower score under every outcome would also yield a lower expected score under every
probability distribution on the sample space. For the strict case, assume $x$ is Bayes against
some positive probability distribution, and let $w^*$ denote the least element of this distribution.
Then suppose that $x$ is not strictly admissible, i.e., that for every positive $\epsilon$, no
matter how small, there exists an alternative choice which lowers the score by, say, $\Delta s$
under one outcome without raising the score by more than $\epsilon \Delta s$ under any other outcome.
Thus the score can be lowered by $\Delta s$ under some outcome with probability greater than
or equal to $w^*$. Under all other outcomes, the score is raised (if at all) by not more than
$\epsilon \Delta s$, and the total probability of these other outcomes cannot be more than $1 - w^*$. By
choosing $\epsilon < w^*/(1 - w^*)$, this alternative choice can therefore be made to have a lower
expected score than $x$, contradicting the assumption that $x$ is Bayes against this distribution.
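
The final inequality in this proof can be checked numerically. The sketch below is illustrative only: the distribution `w`, the unit score reduction, and the helper `worst_case_expected_change` are hypothetical, and the function simply evaluates the largest expected score change that is consistent with the bounds used in the proof.

```python
def worst_case_expected_change(w, improved, delta_s, eps):
    """Upper bound on the change in expected score when the score falls by
    delta_s under outcome `improved` and rises by at most eps * delta_s
    under every other outcome."""
    gain = -w[improved] * delta_s
    loss = sum(wj for j, wj in enumerate(w) if j != improved) * eps * delta_s
    return gain + loss

w = [0.5, 0.3, 0.2]                 # an assumed positive distribution
w_star = min(w)                     # least element of the distribution
threshold = w_star / (1 - w_star)   # bound on eps from the proof

# The least favourable case is an improvement under the least likely outcome.
j_star = w.index(w_star)
below = worst_case_expected_change(w, j_star, 1.0, 0.9 * threshold)
above = worst_case_expected_change(w, j_star, 1.0, 1.1 * threshold)
```

With eps below the threshold the bound is negative whichever outcome is improved, so the alternative would have a strictly lower expected score; just above the threshold the bound can become positive.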

THEOREM 5: If, for every $i$, the probability transform $P_i$ is strictly increasing on
$[x_i^L, x_i^U]$, and $p_i = P_i(x_i)$, then the following are equivalent:

(i) $p$ is [strictly] coherent;

(ii) $x$ is [strictly] Bayes;

(iii) $x$ is [strictly] admissible.

Proof: Suppose $p$ is [strictly] coherent. Then, by Theorem 2, there exists a [positive]
probability distribution, $w$, for which $p_i = P_w(E_i|F_i)$, or else $P_w(F_i) = 0$, for every $i$. Since
$p_i = P_i(x_i)$ and $P_i$ is strictly increasing on $[x_i^L, x_i^U]$, $x_i$ uniquely minimizes $r_i(\cdot, p_i)$. From
the representation of the expected total score given in Equation (6.11), it follows that $x$ is
Bayes against $w$. Thus, $x$ is [strictly] Bayes if $p$ is [strictly] coherent. By Theorem 4, $x$ is
[strictly] admissible if it is [strictly] Bayes. Finally, by Theorem 3, $p$ is [strictly] coherent
if $x$ is [strictly] admissible.
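
The role of Equation (6.11) in this proof can be illustrated with a small numerical sketch. It assumes the quadratic rule f(x, E) = (x - E)^2, applied only when the conditioning event F_i occurs, with probability transform P(x) = x. The four-point sample space, the distribution `w`, the event indicators, and the function names are all invented for this example.

```python
# Four outcomes with an assumed positive distribution w.
w = [0.1, 0.2, 0.3, 0.4]
# Each pair is (indicator of E_i, indicator of F_i), listed by outcome.
PAIRS = [
    ([1, 0, 1, 0], [1, 1, 1, 1]),   # E_1, unconditionally
    ([1, 0, 0, 0], [1, 1, 0, 0]),   # E_2 given F_2 = {outcomes 0, 1}
]

def cond_prob(E, F):
    """P_w(E | F) for indicator vectors E and F."""
    p_f = sum(wj * f for wj, f in zip(w, F))
    return sum(wj * e * f for wj, e, f in zip(w, E, F)) / p_f

def expected_score(x):
    """E_w[S(x)] with partial score F_i * (x_i - E_i)^2."""
    return sum(
        wj * sum(F[j] * (xi - E[j]) ** 2 for xi, (E, F) in zip(x, PAIRS))
        for j, wj in enumerate(w)
    )

def gradient(x, h=1e-6):
    """Central-difference gradient of the expected score."""
    g = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        dn = x[:i] + [x[i] - h] + x[i + 1:]
        g.append((expected_score(up) - expected_score(dn)) / (2 * h))
    return g

# The coherent choice: each x_i equals the conditional probability under w.
x_bayes = [cond_prob(E, F) for E, F in PAIRS]
grad_at_bayes = gradient(x_bayes)
```

Under these assumptions the gradient of the expected score vanishes (up to rounding) at `x_bayes`, and any other choice, such as [0.5, 0.5], has a higher expected score under `w`.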

This theorem completes the generalization of de Finetti's notion of the equivalence of betting
systems and scoring rules as methods for defining and measuring subjective probability, which
was recently proved by Lindley (1980) in a weaker form. It has been shown that the requirements
of [strict] admissibility for conditional uncertainty assessments under generalized scoring
rules with strictly increasing probability transforms give rise to the probability laws in the same
way as the requirements of [strict] coherence for conditional bet prices. If, however, the probability
transforms are not all strictly increasing, then the existence of a probability measure consistent
with the probability transforms of the choices, which is a necessary condition for admissibility
according to Theorem 3, is not also a sufficient condition. An example of the latter
situation is provided by the trigonometric scoring rule introduced previously, for the case in
which $k = 1$. For some event $E$, let $x_1$ and $x_2$ be chosen to describe the unconditional uncertainty
of $E$ and $1-E$, respectively, both under this scoring rule. For either value of $E$, the total
score is then given by $f(x_1,E;1) + f(x_2,1-E;1)$, where $f(x,E;k)$ is defined by Equation (5.4).
From Theorem 3, a necessary condition for $(x_1,x_2)$ to be admissible is $P(x_1;1) + P(x_2;1) = 1$,
where $P(x;k)$ is given by Equation (5.5). For some $p$ in (0,1), consider all pairs $(x_1,x_2)$ which
meet the above condition by satisfying $P(x_1;1) = p$ and $P(x_2;1) = 1-p$. There are nine such distinct
pairs, corresponding to the combinations of the three solutions for $x_1$ and the three solutions
for $x_2$, as noted at the end of Section 5. If $p \ne 1/2$, exactly one of these nine pairs is Bayes
against $w = (p, 1-p)$, namely the unique pair of which one element lies in $(0, 1/6)$ and the other
element lies in $(5/6, 1)$. (This pair is admissible, by Theorem 4.) Also, exactly one pair is inadmissible,
namely the unique pair of which both elements lie in $(1/3, 2/3)$. (This can be demonstrated
by a simple geometrical argument, based on the fact that both $f(x,0;1)$ and $f(x,1;1)$
are concave on $(1/3, 2/3)$.) The remaining seven pairs are also, in fact, admissible, even though
none of them is Bayes against any $w$. If $p = 1/2$ there are four admissible pairs which are Bayes
($x_1, x_2 \in \{1/6, 5/6\}$), one inadmissible pair ($x_1 = x_2 = 1/2$), and four admissible pairs which are not
Bayes.
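
Equations (5.4) and (5.5) are not reproduced in this section, so the following sketch uses a purely hypothetical stand-in for the probability transform, P(x;1) = sin^2(3*pi*x/2). It was chosen only because it is non-monotone on [0, 1] and takes the value 1/2 at x = 1/6, 1/2 and 5/6; it should not be read as the report's actual formula. Under that assumption, the code enumerates the three solutions of P(x;1) = p and of P(x;1) = 1 - p, giving the nine candidate pairs discussed above. It does not test admissibility, which would require the scoring function itself.

```python
import math
from itertools import product

def transform(x):
    """Hypothetical stand-in for P(x; 1); not taken from Equation (5.5)."""
    return math.sin(1.5 * math.pi * x) ** 2

def solutions(p):
    """All x in [0, 1] with transform(x) == p, for 0 < p < 1."""
    a = 2 * math.asin(math.sqrt(p)) / (3 * math.pi)   # root in (0, 1/3)
    return [a, 2 / 3 - a, 2 / 3 + a]                  # one root per branch

p = 0.3
pairs = list(product(solutions(p), solutions(1 - p)))

# Pairs with both elements on the decreasing branch (1/3, 2/3).
middle = [(x1, x2) for x1, x2 in pairs
          if 1 / 3 < x1 < 2 / 3 and 1 / 3 < x2 < 2 / 3]
# Pairs with one element in (0, 1/6) and the other in (5/6, 1).
outer = [(x1, x2) for x1, x2 in pairs
         if (x1 < 1 / 6 and x2 > 5 / 6) or (x2 < 1 / 6 and x1 > 5 / 6)]
```

For p = 0.3 this stand-in yields nine pairs, exactly one of which has both elements in (1/3, 2/3) and exactly one of which has one element in (0, 1/6) and the other in (5/6, 1).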

The above results can be extended to the case in which assumptions A2 and A4 are
relaxed to allow scoring functions whose derivatives are only piecewise continuous. In particular,
assume that $f_i(\cdot,0)$ and $f_i(\cdot,1)$ have piecewise continuous derivatives such that the
corresponding probability transform is piecewise continuous, and satisfies $P_i^-(x) \le P_i^+(x)$ for
all $x$ in $[x_i^L, x_i^U]$, and also $P_i^+(x_i^L) < 1$ and $P_i^-(x_i^U) > 0$, where $P_i^-(x)$ and $P_i^+(x)$ denote the limits
from the left and right, respectively, of $P_i$ at $x$. Then, by the same procedure as in the
proof of Theorem 3, invoking the results of Theorem 2* rather than Theorem 2, it can be
shown that a choice $x$ is [strictly] admissible only if $(p^-,p^+)$ is a set of [strictly] coherent
buying/selling bet prices, where $p_i^- = P_i^-(x_i)$ and $p_i^+ = P_i^+(x_i)$ for all $i$, with $P_i^-(x_i^L) \equiv 0$ and
$P_i^+(x_i^U) \equiv 1$. If $P_i$ is also strictly increasing for every $i$, then a corresponding generalization of
Theorem 5 is obtained. Thus, upper and lower probabilities can also arise in the context of
scoring rules.

Blackwell and Girshick (1954) give numerous admissibility results for statistical games,
using a definition of admissibility which is intermediate in strength between admissibility and
strict admissibility as defined in this paper. (In particular, their definition of admissibility,
which follows Wald (1950), is obtained by substituting a weak inequality for '>' in the definition
of admissibility at the beginning of this section.) An "S-game against nature" is a statistical game
defined by a set $S$ in $m$-space, in which the player chooses a strategy consisting of a vector
$s = (s_1, \ldots, s_m)$ in $S$, and "nature" then randomly chooses a coordinate, $j$, whereupon the
player receives a loss of $s_j$. The process of eliciting conditional uncertainty assessments for a
set of event pairs under generalized scoring rules, as described in this section, is clearly a special
kind of S-game against nature, in which the set $S$ consists of all $s(x)$ generated by (6.1) for
values of $x$ satisfying $x_i^L \le x_i \le x_i^U$, $i = 1, \ldots, n$. This set is closed, but not generally convex,
although if the probability transforms of the scoring rules are all strictly increasing, then the
admissible points lie on a convex boundary. Blackwell and Girshick show that for S-games in
which $S$ is closed and convex, with the '≤' definition of admissibility, every strictly Bayes strategy
is admissible, and every admissible strategy is Bayes. This result is applicable to the case
of scoring rules with strictly increasing probability transforms, but it is less specific than
Theorem 5. However, by a direct application of the basic separation theorem, it can be shown
that in every S-game which is closed, convex, and bounded below, a strategy is [strictly] admissible
if-and-only-if it is [strictly] Bayes, under the definitions used here.
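
The gap between admissible and Bayes strategies when $S$ is not convex can be sketched with a small finite S-game. Everything below is an illustrative assumption: the four loss vectors, the helper names, and the grid search over w, which only approximates the Bayes test and is adequate here because m = 2. Admissibility follows the definition used in this paper (no alternative is strictly better under every outcome).

```python
S = [(0.0, 4.0), (4.0, 0.0), (2.5, 2.5), (3.0, 3.0)]   # assumed loss vectors

def is_admissible(s, strategies):
    """True if no alternative has a strictly lower loss under every outcome."""
    return not any(all(a < b for a, b in zip(alt, s)) for alt in strategies)

def is_bayes(s, strategies, steps=1000):
    """True if s minimizes expected loss for some w = (w1, 1 - w1) on a grid."""
    for k in range(steps + 1):
        w = (k / steps, 1 - k / steps)
        loss = sum(wi * si for wi, si in zip(w, s))
        best = min(sum(wi * ai for wi, ai in zip(w, alt)) for alt in strategies)
        if loss <= best + 1e-12:
            return True
    return False

# Map each strategy to (admissible?, Bayes?).
report = {s: (is_admissible(s, S), is_bayes(s, S)) for s in S}
```

In this example (2.5, 2.5) is admissible but not Bayes against any w on the grid, which is possible only because this finite set is not convex, while (3.0, 3.0) is neither admissible nor Bayes.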